Whenever an IoT or robotics project needs video output, the ESP32-CAM module comes to mind. It covers most needs and costs less than the other modules available. The good thing is that we can store video and images on an SD card, or stream them to another device over Wi-Fi.

There are other options you may try as well, such as integrating the module with Blynk IoT, a cloud service or Telegram to access the data from anywhere on the internet. In this tutorial we will set up a server for the ESP32 camera and then implement the same with a Telegram bot. With the server, access is limited to the Wi-Fi range, but with Telegram we can get the images over the internet.

Esp32 Sense:

The Seeed Studio XIAO ESP32S3 Sense integrates a camera sensor, a digital microphone, and SD card support. Combining embedded ML computing power with photography capability, this development board is a great tool for getting started with intelligent voice and vision AI. See the ESP32S3 datasheet for more details.

Features:

· It has a detachable OV2640 camera sensor with 1600*1200 resolution, and it is also compatible with the OV5640 camera sensor.

· It supports programming in Arduino and MicroPython.

· It supports 2.4GHz Wi-Fi and BLE dual wireless communication, with 100m+ remote communication when connected to a U.FL antenna.

The ESP32S3 by XIAO is overpowered: it has an onboard programmer, a battery monitoring circuit and camera hardware, so there is no need for an external programmer as with other Cam modules. Just plug the board into a PC through USB and we are ready to go. You can connect an external battery directly to the ESP32; charging is initiated automatically with a maximum current of 100mA. Only lithium-ion and Li-Po batteries are supported.

Components required:

1) ESP32S3- SENSE by XIAO

(You can try Seeed Fusion PCB Assembly service from here)

2) Battery

TINY-ML to ESP32S3:

Many machine learning projects can be made with this board; a few are listed here. You can replicate these projects to learn Tiny-ML from scratch. For any prototyping project using the ESP32-Sense, you can send your 3D-print and PCB designs to JLCPCB for fabrication. JLCPCB is a China-based PCB manufacturer with more than 10 years of experience in this field. JLCPCB recently launched many other prototyping services such as RF PCB, high-precision PCB, 3D printing and metal CNC.

I always suggest JLCPCB because it offers its services at very low prices (5pcs of 2-layer PCB for just $2) and the quality is also very good. Sign up now using this link to JLCPCB and get free coupons of up to $54 for your next projects.

Upgrade to ESP32s3:

This board has a very small form factor, which is great, but its powerful CPU dissipates a lot of heat, which can be felt on the back side of the module. Continuous video streaming may damage the CPU because of the high power consumption (350mA), so it is suggested to add a small heatsink with an insulated thermal pad on the back side of the PCB.
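To put the 350mA streaming draw and the 100mA charging limit mentioned in this article into perspective, here is a rough back-of-the-envelope sketch in plain C++. The 1000mAh capacity is a hypothetical value I picked only for illustration, and the helper name hoursAt is my own. The calculation is ideal-case: it ignores converter losses, battery ageing and the charger tapering off near full charge.

```cpp
#include <cassert>

// Ideal-case estimate in hours: capacity (mAh) divided by current (mA).
// Real runtime will be shorter and real charging will take longer.
double hoursAt(double capacity_mAh, double current_mA) {
    assert(current_mA > 0.0);  // a zero or negative current makes no sense here
    return capacity_mAh / current_mA;
}

// Figures quoted in this article.
const double kStreamingCurrent_mA = 350.0;  // draw during continuous streaming
const double kChargeCurrent_mA    = 100.0;  // maximum charging current

// Hypothetical battery size, chosen only for illustration.
const double kExampleCapacity_mAh = 1000.0;
```

With these assumed numbers, hoursAt(kExampleCapacity_mAh, kStreamingCurrent_mA) comes out a little under 3 hours of streaming, while hoursAt(kExampleCapacity_mAh, kChargeCurrent_mA) gives about 10 hours for a full recharge.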

Setting up the Telegram bot:

This is basically a bot-making procedure. We have to make a bot that sends commands to the ESP32 over the internet, and then we can request images over the internet through those commands.

Step1: Open the Telegram app and search for “BotFather” (https://t.me/BotFather), then send the following commands:

1) Start the bot: /start

2) For new bot: /newbot

3) Name your bot; mine was: SENSE

4) Give the bot an ID (username), which must end with "bot"; mine was: Sense_32bot

This will generate an HTTP API token and a bot messenger link. Use the API token in your code to implement the Telegram bot, and use the bot messenger link to send commands to the ESP32S3_Sense.
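To make it clearer where the API token ends up, here is a minimal sketch in plain C++ of how a Telegram Bot API request address is put together: the token goes right after "bot" in the path, followed by the method name. The function names botMethodPath and botMethodUrl are my own, and the token in the usage note is a made-up placeholder, not a real one.

```cpp
#include <string>

// Host that serves the Telegram Bot API.
const std::string kTelegramHost = "api.telegram.org";

// Path part of a Bot API request: /bot<token>/<method>
std::string botMethodPath(const std::string& token, const std::string& method) {
    return "/bot" + token + "/" + method;
}

// Full HTTPS address of a Bot API method, handy for testing from a browser.
std::string botMethodUrl(const std::string& token, const std::string& method) {
    return "https://" + kTelegramHost + botMethodPath(token, method);
}
```

For example, botMethodPath("123456:EXAMPLE", "sendPhoto") gives "/bot123456:EXAMPLE/sendPhoto", which has the same shape as the path the Arduino program posts the picture to.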

Step2: Get your Telegram ID using “IDBot” (https://t.me/myidbot). This ID is unique to your account. Send the following commands:

1) /start

2) /getid

This will give you your Telegram chat ID, which is used in the Arduino program later on. Here is the command line of my Telegram bot, which I named "Sense".
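The chat ID works as a simple access check: the program only obeys messages coming from your own ID. Below is a minimal sketch of that decision logic in plain C++, so it can be tried on a PC without the board. The Action enum and the decide function are hypothetical names of mine; the /start, /flash and /photo commands are the ones used by the Arduino program in this tutorial.

```cpp
#include <string>

// What the bot should do with one incoming message.
enum class Action { Ignore, Reject, Welcome, ToggleFlash, TakePhoto };

// Decide on an action from the sender's chat ID and the message text.
// Messages from any chat other than the allowed one are rejected first.
Action decide(const std::string& chatId,
              const std::string& allowedChatId,
              const std::string& text) {
    if (chatId != allowedChatId) return Action::Reject;
    if (text == "/start") return Action::Welcome;
    if (text == "/flash") return Action::ToggleFlash;
    if (text == "/photo") return Action::TakePhoto;
    return Action::Ignore;  // unknown text from the owner is silently skipped
}
```

Checking the chat ID before looking at the text means a stranger who finds the bot cannot trigger the camera, even with a valid command.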

Code:

Note that this program has been modified to match the camera pin arrangement of the ESP32S3_Sense board. The code consists of two files, so please make sure to use the code downloaded from my GitHub.

#include <Arduino.h>
#include <WiFi.h>
#include <WiFiClientSecure.h>
#include "soc/soc.h"
#include "soc/rtc_cntl_reg.h"
#include "esp_camera.h"
#include <UniversalTelegramBot.h>
#include <ArduinoJson.h>

const char* ssid = "sagar";
const char* password = "12345678";

// Initialize Telegram BOT
String BOTtoken = "6032172596:AAH1Wmi-BcnI45kn5XVcIlkveGBVfp6BRjM";  // your Bot Token (Get from Botfather)

// Use @myidbot to find out the chat ID of an individual or a group
// Also note that you need to click "start" on a bot before it can
// message you
String CHAT_ID = "1086823712";

bool sendPhoto = false;

WiFiClientSecure clientTCP;
UniversalTelegramBot bot(BOTtoken, clientTCP);

#define FLASH_LED_PIN 4
bool flashState = LOW;

//Checks for new messages every 1 second.
int botRequestDelay = 1000;
unsigned long lastTimeBotRan;

//CAMERA_MODEL_XIAO_SENSE
#define PWDN_GPIO_NUM     -1
#define RESET_GPIO_NUM    -1
#define XCLK_GPIO_NUM     10
#define SIOD_GPIO_NUM     40
#define SIOC_GPIO_NUM     39

#define Y9_GPIO_NUM       48
#define Y8_GPIO_NUM       11
#define Y7_GPIO_NUM       12
#define Y6_GPIO_NUM       14
#define Y5_GPIO_NUM       16
#define Y4_GPIO_NUM       18
#define Y3_GPIO_NUM       17
#define Y2_GPIO_NUM       15
#define VSYNC_GPIO_NUM    38
#define HREF_GPIO_NUM     47
#define PCLK_GPIO_NUM     13

void configInitCamera(){
  camera_config_t config;
  config.ledc_channel = LEDC_CHANNEL_0;
  config.ledc_timer = LEDC_TIMER_0;
  config.pin_d0 = Y2_GPIO_NUM;
  config.pin_d1 = Y3_GPIO_NUM;
  config.pin_d2 = Y4_GPIO_NUM;
  config.pin_d3 = Y5_GPIO_NUM;
  config.pin_d4 = Y6_GPIO_NUM;
  config.pin_d5 = Y7_GPIO_NUM;
  config.pin_d6 = Y8_GPIO_NUM;
  config.pin_d7 = Y9_GPIO_NUM;
  config.pin_xclk = XCLK_GPIO_NUM;
  config.pin_pclk = PCLK_GPIO_NUM;
  config.pin_vsync = VSYNC_GPIO_NUM;
  config.pin_href = HREF_GPIO_NUM;
  config.pin_sscb_sda = SIOD_GPIO_NUM;
  config.pin_sscb_scl = SIOC_GPIO_NUM;
  config.pin_pwdn = PWDN_GPIO_NUM;
  config.pin_reset = RESET_GPIO_NUM;
  config.xclk_freq_hz = 20000000;
  config.pixel_format = PIXFORMAT_JPEG;

  //init with high specs to pre-allocate larger buffers
  if(psramFound()){
    config.frame_size = FRAMESIZE_UXGA;
    config.jpeg_quality = 10;  //0-63 lower number means higher quality
    config.fb_count = 2;
  } else {
    config.frame_size = FRAMESIZE_SVGA;
    config.jpeg_quality = 12;  //0-63 lower number means higher quality
    config.fb_count = 1;
  }
  
  // camera init
  esp_err_t err = esp_camera_init(&config);
  if (err != ESP_OK) {
    Serial.printf("Camera init failed with error 0x%x\n", err);
    delay(1000);
    ESP.restart();
  }

  // Drop down frame size for higher initial frame rate
  sensor_t * s = esp_camera_sensor_get();
  s->set_framesize(s, FRAMESIZE_CIF);  // UXGA|SXGA|XGA|SVGA|VGA|CIF|QVGA|HQVGA|QQVGA
}

void handleNewMessages(int numNewMessages) {
  Serial.print("Handle New Messages: ");
  Serial.println(numNewMessages);

  for (int i = 0; i < numNewMessages; i++) {
    String chat_id = String(bot.messages[i].chat_id);
    if (chat_id != CHAT_ID){
      bot.sendMessage(chat_id, "Unauthorized user", "");
      continue;
    }
    
    // Print the received message
    String text = bot.messages[i].text;
    Serial.println(text);
    
    String from_name = bot.messages[i].from_name;
    if (text == "/start") {
      String welcome = "Welcome , " + from_name + "\n";
      welcome += "Use the following commands to interact with the ESP32-CAM \n";
      welcome += "/photo : takes a new photo\n";
      welcome += "/flash : toggles flash LED \n";
      bot.sendMessage(CHAT_ID, welcome, "");
    }
    if (text == "/flash") {
      flashState = !flashState;
      digitalWrite(FLASH_LED_PIN, flashState);
      Serial.println("Change flash LED state");
    }
    if (text == "/photo") {
      sendPhoto = true;
      Serial.println("New photo request");
    }
  }
}

String sendPhotoTelegram() {
  const char* myDomain = "api.telegram.org";
  String getAll = "";
  String getBody = "";

  camera_fb_t * fb = NULL;
  fb = esp_camera_fb_get();  
  if(!fb) {
    Serial.println("Camera capture failed");
    delay(1000);
    ESP.restart();
    return "Camera capture failed";
  }  
  
  Serial.println("Connect to " + String(myDomain));


  if (clientTCP.connect(myDomain, 443)) {
    Serial.println("Connection successful");
    
    String head = "--Electro\r\nContent-Disposition: form-data; name=\"chat_id\"; \r\n\r\n" + CHAT_ID + "\r\n--Electro\r\nContent-Disposition: form-data; name=\"photo\"; filename=\"esp32-cam.jpg\"\r\nContent-Type: image/jpeg\r\n\r\n";
    String tail = "\r\n--Electro--\r\n";

    // Use 32-bit lengths: a large JPEG can exceed 65535 bytes, which would overflow uint16_t
    uint32_t imageLen = fb->len;
    uint32_t extraLen = head.length() + tail.length();
    uint32_t totalLen = imageLen + extraLen;
  
    clientTCP.println("POST /bot"+BOTtoken+"/sendPhoto HTTP/1.1");
    clientTCP.println("Host: " + String(myDomain));
    clientTCP.println("Content-Length: " + String(totalLen));
    clientTCP.println("Content-Type: multipart/form-data; boundary=Electro");
    clientTCP.println();
    clientTCP.print(head);
  
    // Send the JPEG in chunks of up to 1024 bytes
    uint8_t *fbBuf = fb->buf;
    size_t fbLen = fb->len;
    for (size_t n = 0; n < fbLen; n += 1024) {
      // Only the last chunk may be shorter; this also covers lengths that are exact multiples of 1024
      size_t chunk = (fbLen - n < 1024) ? (fbLen - n) : 1024;
      clientTCP.write(fbBuf + n, chunk);
    }
    
    clientTCP.print(tail);
    
    esp_camera_fb_return(fb);
    
    const unsigned long waitTime = 10000;   // timeout 10 seconds
    unsigned long startTimer = millis();
    boolean state = false;
    
    // Unsigned subtraction stays correct even when millis() rolls over
    while (millis() - startTimer < waitTime){
      Serial.print(".");
      delay(100);      
      while (clientTCP.available()) {
        char c = clientTCP.read();
        if (state==true) getBody += String(c);        
        if (c == '\n') {
          if (getAll.length()==0) state=true; 
          getAll = "";
        } 
        else if (c != '\r')
          getAll += String(c);
        startTimer = millis();
      }
      if (getBody.length()>0) break;
    }
    clientTCP.stop();
    Serial.println(getBody);
  }
  else {
    esp_camera_fb_return(fb);  // release the frame buffer on this path as well
    getBody="Connection to api.telegram.org failed.";
    Serial.println("Connection to api.telegram.org failed.");
  }
  return getBody;
}

void setup(){
  WRITE_PERI_REG(RTC_CNTL_BROWN_OUT_REG, 0); 
  // Init Serial Monitor
  Serial.begin(115200);

  // Set LED Flash as output
  pinMode(FLASH_LED_PIN, OUTPUT);
  digitalWrite(FLASH_LED_PIN, flashState);

  // Config and init the camera
  configInitCamera();

  // Connect to Wi-Fi
  WiFi.mode(WIFI_STA);
  Serial.println();
  Serial.print("Connecting to ");
  Serial.println(ssid);
  WiFi.begin(ssid, password);
  clientTCP.setCACert(TELEGRAM_CERTIFICATE_ROOT); // Add root certificate for api.telegram.org
  while (WiFi.status() != WL_CONNECTED) {
    Serial.print(".");
    delay(500);
  }
  Serial.println();
  Serial.print("ESP32-CAM IP Address: ");
  Serial.println(WiFi.localIP()); 
}

void loop() {
  if (sendPhoto) {
    Serial.println("Preparing photo");
    sendPhotoTelegram(); 
    sendPhoto = false; 
  }
  if (millis() - lastTimeBotRan > (unsigned long)botRequestDelay)  {  // rollover-safe interval check
    int numNewMessages = bot.getUpdates(bot.last_message_received + 1);
    while (numNewMessages) {
      Serial.println("got response");
      handleNewMessages(numNewMessages);
      numNewMessages = bot.getUpdates(bot.last_message_received + 1);
    }
    lastTimeBotRan = millis();
  }
}
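The photo upload in the program comes down to two bits of arithmetic: the Content-Length of the multipart body, and the split of the JPEG into 1024-byte pieces. Here is a small sketch of both in plain C++ that can be checked on a PC. The function names are hypothetical helpers of mine and are not part of the Arduino program or of any library.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Content-Length of the multipart body:
// form header part + JPEG bytes + closing boundary.
std::uint32_t multipartLength(std::size_t headLen, std::size_t imageLen, std::size_t tailLen) {
    return static_cast<std::uint32_t>(headLen + imageLen + tailLen);
}

// What a 16-bit variable would hold instead: values above 65535 wrap around.
std::uint16_t truncatedTo16Bits(std::uint32_t length) {
    return static_cast<std::uint16_t>(length);
}

// Sizes of the pieces written to the socket when sending `total` bytes
// in chunks of at most `chunk` bytes. Only the last piece may be shorter.
std::vector<std::size_t> chunkSizes(std::size_t total, std::size_t chunk) {
    std::vector<std::size_t> sizes;
    if (chunk == 0) return sizes;  // nothing sensible to do with a zero chunk size
    for (std::size_t sent = 0; sent < total; sent += chunk) {
        std::size_t remaining = total - sent;
        sizes.push_back(remaining < chunk ? remaining : chunk);
    }
    return sizes;
}
```

As an example with made-up sizes, a 100000-byte picture with a 150-byte header part and a 16-byte tail gives a Content-Length of 100166, but a 16-bit variable would report 34630, so the server would be told the wrong length. For the chunking, 2500 bytes are sent as 1024 + 1024 + 452, and 2048 bytes as exactly two full pieces.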

How to upload code:

To upload the code, first install the ESP32 board package in the Arduino IDE by adding the JSON URL given below to the Preferences under the File menu. Then go to the Boards Manager under the Tools menu and install the ESP32 boards. https://raw.githubusercontent.com/espressif/arduino-esp32/gh-pages/package_esp32_index.json

Choose XIAO ESP32S3 as the board and enable the PSRAM before uploading.
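To get a feeling for why the PSRAM matters at high resolutions, here is a rough calculation in plain C++. It is only an upper-bound illustration: the program captures JPEG, which is compressed and much smaller than a raw frame, and the exact buffer sizes the camera driver allocates are not something I am stating here. The 512 KB internal SRAM figure is taken from the ESP32-S3 datasheet; the helper name rawFrameBytes is my own.

```cpp
#include <cstddef>

// Bytes needed to hold one uncompressed frame.
constexpr std::size_t rawFrameBytes(std::size_t width,
                                    std::size_t height,
                                    std::size_t bytesPerPixel) {
    return width * height * bytesPerPixel;
}

constexpr std::size_t kUxgaWidth  = 1600;          // UXGA, the sensor's full resolution
constexpr std::size_t kUxgaHeight = 1200;
constexpr std::size_t kRgb565BytesPerPixel = 2;    // RGB565 packs one pixel into 16 bits

// Internal SRAM of the ESP32-S3 (512 KB per the datasheet).
constexpr std::size_t kInternalSramBytes = 512 * 1024;
```

1600*1200 is 1,920,000 pixels, which is where the "2MP" figure comes from, and one raw RGB565 frame at that size takes 3,840,000 bytes, several times more than the internal SRAM. Even with JPEG compression, having the PSRAM enabled gives the camera room for the larger frame sizes and the two frame buffers the program asks for.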

Choose the right COM port and hit upload. After uploading, open the Serial Monitor: you will get the IP address for video streaming over Wi-Fi, or you can send commands directly through Telegram.

Testing the code:

After uploading, when I started my own Telegram bot, I got two options: one for the flash and the other for the image. This small board doesn't have a flash, but you can use an LED on pin number 4 as mentioned in the program. The image quality is amazing with its 2MP camera sensor. Here are some sample images taken with the camera inside my room.

Image 1: 

Image 2:

The camera quality is quite good, and the board can also do face recognition, which can be implemented later on.