Project Overview
The virtual assistant combines computer vision, speech recognition, natural language processing, and system interaction to automate various user tasks, such as:
- Face recognition for identification.
- Voice-based command execution.
- System automation (shutdown, restart, sleep).
- Accessing online resources (Google, YouTube, Gmail, Maps, etc.).
- Providing system information (battery percentage, username, etc.).
- Enhancing user interaction through jokes, music, and personalized greetings.
Together, these features make the project suitable for personal use or as a demonstration of AI-based automation.

Key Functionalities
a. Face Recognition
Libraries Used: `cv2` for webcam operations, `face_recognition` for image processing and face matching.
Steps:
- A reference image (e.g., yash.jpg) is loaded using `face_recognition`.
- The face encoding of the reference image is computed.
- The webcam captures frames, and faces are compared to the reference encoding.
- If a match is found, a confirmation message is played using text-to-speech (`pyttsx3`).
Application: This serves as a security feature to identify the user before executing commands.
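The steps above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the function names `encodings_match` and `verify_user`, the `max_frames` limit, and the use of webcam device 0 are assumptions. The pure-Python helper shows the kind of distance threshold that `face_recognition.compare_faces` applies (its default tolerance is 0.6).

```python
import math


def encodings_match(known, candidate, tolerance=0.6):
    """Return True when two face encodings lie within `tolerance` of each other.

    Illustrative helper: applies a Euclidean-distance threshold, the same kind
    of check that face_recognition.compare_faces performs internally.
    """
    distance = math.sqrt(sum((a - b) ** 2 for a, b in zip(known, candidate)))
    return distance <= tolerance


def verify_user(reference_path="yash.jpg", max_frames=50):
    """Compare webcam frames against a reference image (hypothetical helper)."""
    # Third-party imports are local so the helper above works without them.
    import cv2
    import face_recognition

    # Load the reference image and compute its face encoding.
    reference_image = face_recognition.load_image_file(reference_path)
    reference_encodings = face_recognition.face_encodings(reference_image)
    if not reference_encodings:
        raise ValueError("No face found in the reference image")
    reference = reference_encodings[0]

    camera = cv2.VideoCapture(0)  # assumption: the default webcam is device 0
    try:
        for _ in range(max_frames):
            ok, frame = camera.read()
            if not ok:
                continue
            # OpenCV delivers BGR frames; face_recognition expects RGB.
            rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
            for encoding in face_recognition.face_encodings(rgb):
                if face_recognition.compare_faces([reference], encoding)[0]:
                    return True
        return False
    finally:
        camera.release()
```

Calling `verify_user()` returns `True` as soon as one frame contains a matching face, and `False` if no match appears within `max_frames` frames. Lowering the tolerance makes matching stricter.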
b. Voice Assistant and Command Handling
Libraries Used: `speech_recognition` for converting speech to text, `pyttsx3` for converting text to speech.
Steps:
- The assistant listens for user input via the microphone.
- The speech input is converted to text using Google's Speech-to-Text API.
- The assistant matches the text with predefined commands to perform specific actions.
Examples of Commands:
- Greeting the user: "Hi Yash, how are you?"
- Fetching system information: "Tell me the battery percentage."
- Opening apps and websites: "Open Google" or "Search YouTube for cats."
- System control: "Shutdown the computer" or "Put it to sleep."
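One possible shape for the listen, transcribe, and match loop is sketched below. The command table, the action names, and the functions `match_command`, `listen_once`, and `speak` are hypothetical; the project may organize its commands differently. Matching here is plain substring matching, so a keyword such as "sleep" would also match "asleep".

```python
def match_command(text):
    """Map recognized text to an action name using simple keyword matching."""
    # Hypothetical command table: (keywords that must all appear, action name).
    commands = [
        (("battery",), "battery_status"),
        (("youtube",), "youtube_search"),
        (("open", "google"), "open_google"),
        (("shutdown",), "shutdown"),
        (("sleep",), "sleep"),
    ]
    lowered = text.lower()
    for keywords, action in commands:
        if all(keyword in lowered for keyword in keywords):
            return action
    return None  # no predefined command matched


def listen_once():
    """Capture one utterance from the microphone and return it as text."""
    import speech_recognition as sr  # local import: third-party dependency

    recognizer = sr.Recognizer()
    with sr.Microphone() as source:
        audio = recognizer.listen(source)
    try:
        return recognizer.recognize_google(audio)
    except sr.UnknownValueError:
        return None  # speech was unintelligible
    except sr.RequestError:
        return None  # the recognition service could not be reached


def speak(message):
    """Read a message aloud with the default text-to-speech voice."""
    import pyttsx3  # local import: third-party dependency

    engine = pyttsx3.init()
    engine.say(message)
    engine.runAndWait()
```

A main loop would call `listen_once()`, pass the result to `match_command`, and dispatch on the returned action name, using `speak` for any spoken reply.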
c. System Automation
Tasks Automated:
- Shutdown, Restart, and Sleep: Uses `os.system` commands and `pyautogui` for GUI automation.
- Taking Screenshots: Captures and saves screenshots using `pyautogui.screenshot`.
- Switching Windows: Uses keyboard shortcuts like Alt + Tab to switch between open windows.
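As a rough sketch of these automation tasks, the helper below maps an action to a shell command and leaves execution to a separate function. The overview does not say which operating system or exact commands the project uses, so the Windows and Linux command strings, the function names, and the default screenshot filename are all assumptions. On Windows, the `SetSuspendState` call may hibernate rather than sleep when hibernation is enabled.

```python
import os
import platform


def power_command(action, platform_name):
    """Return the shell command for a power action without running it."""
    # Assumed command strings; the project may use different ones.
    commands = {
        "windows": {
            "shutdown": "shutdown /s /t 0",
            "restart": "shutdown /r /t 0",
            "sleep": "rundll32.exe powrprof.dll,SetSuspendState 0,1,0",
        },
        "linux": {
            "shutdown": "systemctl poweroff",
            "restart": "systemctl reboot",
            "sleep": "systemctl suspend",
        },
    }
    try:
        return commands[platform_name.lower()][action]
    except KeyError:
        raise ValueError(
            f"Unsupported action or platform: {action!r} on {platform_name!r}"
        ) from None


def run_power_action(action):
    """Execute the power action for the current operating system."""
    return os.system(power_command(action, platform.system()))


def save_screenshot(path="screenshot.png"):
    """Capture the screen and save it to `path`."""
    import pyautogui  # local import: third-party dependency

    return pyautogui.screenshot(path)


def switch_window():
    """Press Alt + Tab to move to the next open window."""
    import pyautogui  # local import: third-party dependency

    pyautogui.hotkey("alt", "tab")
```

Keeping command construction separate from execution makes the mapping easy to inspect and test without actually shutting the machine down.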
d. Integration with Online Resources
Google Search: Opens the browser with search results for a query. Example: "Google search machine learning."
YouTube Search: Opens YouTube and searches for a specified topic. Example: "Search YouTube for Python tutorials."
Gmail, Maps, and Social Media: Automates navigation to web pages like Gmail, Facebook, and Google Maps.
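Opening search results usually comes down to building a URL with an encoded query and handing it to the default browser. A minimal sketch, assuming the standard Google and YouTube search URL formats (the function names are illustrative, not taken from the project):

```python
import webbrowser
from urllib.parse import quote_plus


def google_search_url(query):
    """Build a Google search URL for the given query."""
    return "https://www.google.com/search?q=" + quote_plus(query)


def youtube_search_url(query):
    """Build a YouTube search URL for the given query."""
    return "https://www.youtube.com/results?search_query=" + quote_plus(query)


def open_search(query, site="google"):
    """Open the search results in the default browser and return the URL."""
    builders = {"google": google_search_url, "youtube": youtube_search_url}
    url = builders[site](query)
    webbrowser.open(url)
    return url
```

For example, `google_search_url("machine learning")` produces `https://www.google.com/search?q=machine+learning`. Using `quote_plus` keeps spaces and special characters in a spoken query from breaking the URL.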
e. System and Personal Information
Fetching Battery Status: Uses the `psutil` library to get the battery percentage and charging status. Alerts the user when the battery is low or charging is required.
Fetching Username: Retrieves the username of the logged-in user using `psutil.users`.
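A sketch of how these lookups might be written with `psutil` follows. The 20 percent low-battery threshold, the message wording, and the function names are assumptions, as is the fallback to `getpass.getuser()` when `psutil.users()` reports no sessions.

```python
import getpass


def battery_message(percent, plugged_in, low_threshold=20):
    """Turn a battery reading into a sentence the assistant can speak."""
    if plugged_in:
        return f"Battery is at {percent} percent and charging."
    if percent <= low_threshold:
        return f"Battery is low at {percent} percent. Please connect the charger."
    return f"Battery is at {percent} percent."


def read_battery():
    """Return (percent, plugged_in), or None when no battery is present."""
    import psutil  # local import: third-party dependency

    battery = psutil.sensors_battery()
    if battery is None:  # e.g. a desktop machine without a battery
        return None
    return round(battery.percent), battery.power_plugged


def current_username():
    """Return the name of the first logged-in user reported by psutil."""
    import psutil  # local import: third-party dependency

    sessions = psutil.users()
    # Assumed fallback: some environments report no sessions at all.
    return sessions[0].name if sessions else getpass.getuser()
```

Checking for `None` matters because `psutil.sensors_battery()` returns `None` on machines where no battery can be detected.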
f. Entertainment Features
- Music Playback: Plays a specific song from a predefined path.
- Telling Jokes: Uses the `pyjokes` library to fetch and speak a random joke.
- Interactive Greetings: Welcomes the user with the current date and time.
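The greeting and joke features could look something like the sketch below. The hour boundaries for morning, afternoon, and evening, the greeting wording, and the function names are assumptions made for illustration.

```python
from datetime import datetime


def time_of_day_greeting(now, name):
    """Build a greeting that includes the current time and date."""
    if now.hour < 12:
        part = "morning"
    elif now.hour < 18:
        part = "afternoon"
    else:
        part = "evening"
    return f"Good {part}, {name}. It is {now:%H:%M} on {now:%Y-%m-%d}."


def random_joke():
    """Fetch a random joke from the pyjokes library."""
    import pyjokes  # local import: third-party dependency

    return pyjokes.get_joke()
```

In use, the assistant would call `time_of_day_greeting(datetime.now(), "Yash")` at startup and pass the result, or the string from `random_joke()`, to its text-to-speech engine.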
g. Advanced Features
- Face Recognition Integration: Ensures tasks are executed only after face verification.
- Hand Gesture Tracking (Placeholder): Future support for gesture-based commands.
- FB Chat Messaging (Commented Out): Placeholder to handle Facebook messaging using the `fbchat` library.
- Alarm Feature: Plays music as an alarm based on specific triggers.
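One way to ensure tasks run only after face verification is to route every command through a small gate object. The `VerifiedSession` class below is a hypothetical sketch, not the project's implementation: it runs the verification callback on first use and caches a successful result for the rest of the session.

```python
class VerifiedSession:
    """Run actions only after a verification callback has succeeded."""

    def __init__(self, verify):
        self._verify = verify  # callable returning True when the user is recognized
        self._verified = False

    def run(self, action):
        """Run `action` if the user is verified; return whether it ran."""
        if not self._verified:
            # Verify on demand and remember a successful result.
            self._verified = bool(self._verify())
        if not self._verified:
            return False
        action()
        return True
```

With this design, a failed check is retried on the next command, while a successful check is not repeated, which avoids opening the webcam for every request.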

Project Link
Explore the complete code and documentation: Sparkle Voice Assistant on GitHub