Compiler-Design-Basics

COMPANY : CODETECH IT SOLUTIONS

NAME : KUNAL YADAV

INTERN ID : CT04DN1692

DOMAIN : C LANGUAGE

DURATION : 4 WEEKS

MENTOR : NEELA SANTHOSH KUMAR

As part of my internship, I developed a simple lexical analyzer using the C programming language. The main objective of the task was to design a program capable of reading an input file and identifying three primary lexical elements in a source code snippet: keywords, operators, and identifiers. This task helped reinforce my understanding of compiler design fundamentals, especially the lexical analysis phase.

I used Visual Studio Code (VS Code) as my development environment due to its support for C development, integrated terminal, and syntax highlighting features that helped in writing and testing the program effectively. The project was designed to process a text file simulating a basic C code snippet, where the program would scan each token and classify it based on predefined rules and lists.

The lexical analyzer worked by reading the input file character by character and separating tokens based on delimiters such as spaces, newlines, and special symbols. I used a predefined list of C keywords (e.g., int, float, if, while, etc.) and operators (e.g., +, -, =, *, /, etc.) to compare each token. If a token matched a keyword or operator, it was labeled accordingly; if it did not match any of them and followed the rules of naming (e.g., starting with a letter or underscore), it was classified as an identifier.

One of the initial challenges I faced was ensuring accurate tokenization, especially when dealing with compound operators (like ==, >=, !=) or separating identifiers from keywords that appeared consecutively. I overcame these issues with the help of ChatGPT and GitHub Copilot, which provided suggestions on how to handle different token patterns and edge cases more effectively.

Another issue was handling special characters and ensuring the program didn’t misclassify them. I used conditional checks and character classification functions (like isalpha() and isdigit()) to validate the structure of identifiers and ignore irrelevant characters like whitespaces or newlines.

INPUT FILR :

OUTPUT :

Name		Name	Last commit message	Last commit date
Latest commit History 5 Commits
README.md		README.md
Task3.c		Task3.c

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Compiler-Design-Basics

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

Compiler-Design-Basics

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages