CS 3358 (Data Structures and Algorithms) by Lee S. Koh

Course Syllabus
Personnel and Logistical Information:

Instructional contact and availability: detailed under Who Where When and Instructor Schedule.

Timeline (of events fixed or more or less so): charted as Dates To Note.

Access to class materials: via a front page on TxState Canvas.


Tip: Click the Home tab near the upper left upon getting into the course.


Additional notes regarding use of TxState Canvas here (for CS3358-252 and CS3358-253, Spring 2025):


You may see Canvas use CS 3358.252 for the "CS3358-252 and CS3358-253 combined" course.
(I.e., CS3358-252 and CS3358-253 students share one course site that Canvas may indicate as CS 3358.002.)
CS3358-253 Students: don't be confused by Canvas use of CS 3358.252 (thinking you are at the wrong place).


Canvas will be used to manage assignments.



  Task descriptions, deliverables, due dates and times, . . . (Typically via externally linked pages.)



  Submission of deliverables. (Typically source files in as-is text format and print-to-PDF format.)



  Return of graded work. (Typically submitted source files in print-to-PDF format that have been marked.)



You are responsible for the successful and timely uploading of the proper files when submitting assignments.



  (In fairness to all, negative consequences of any failures to do so will be incontestably applied.)


Canvas will be used to administer in-class quizzes (ending 5 minutes of each normal, non-exam class meeting).



  (When addressing attendance/participation-related matters, your quiz-participation record on Canvas will serve as key reference.)



You will need to equip yourself to effectively access TxState Canvas and take such in-class quizzes.


As a rule, don't attempt to send any message to instructor via Canvas Inbox.



  (Violators are wholly responsible for any unfavorable consequences.)


In general, point values are accurate and pertinent only for each graded assignment/quiz in Canvas Gradebook.



(I.e., Canvas Gradebook is not set up for tasks like communicating exam scores and tracking aggregated scores/performances.)

Mode of class delivery: CS3358-252 and CS3358-253 are designated as face-to-face.


Where possible/appropriate, in-class activities to be recorded using Zoom-enabling equipment (targeting mainly the vocals).



  Recordings will hopefully serve as a helpful resource for absentee/rehash use.



  Recordings will be made available on TxState Canvas and accessible by students currently enrolled in CS3358-252 and CS3358-253.



Such use of Zoom-enabling equipment is not to be mistaken with the hosting of concurrent Zoom meetings for students to participate virtually.



  No hosting of concurrent Zoom meetings is contemplated or anticipated.

Office hours will be remotely via Zoom Video Communication on first-come-first-served basis.


Office Hour Meeting ID: 894 2532 4432



(The office hours are open to ≈150 CS2318/CS3358 students, so let's use them thoughtfully.)

As needed appointments or special-help group sessions: remotely via Zoom Video Communication.


Appointment requests via TxState email accounts.


Special-help group session announcements via TxState email accounts.

For identification purposes, display your first and last names as provided to the university when participating events remotely via Zoom Video Communication.


Also, you are expected to be equipped with a camera and microphone as some remote participations of events (e.g. online exams) are subject to additional constraints.

How disruptions to routine instruction will be handled.


Notify via TxState email accounts at the earliest possible.
Description and Learning Objectives:

Most computer-based solutions to nontrivial problems involve the manipulation of collections of data. To effectively and efficiently perform such manipulations, appropriate storage and access techniques are essential. In other words, the data collections must be suitably represented and organized (stored in structures of some kind), and suitable algorithms must be devised to operate on the stored data. Some such data structures and algorithms have proven, over the years, to be so natural and powerful that they have come to be regarded as classic mechanisms. An appreciation of the variety and uses of these mechanisms is now considered to be fundamental to the understanding and training of computer-based problem solving and software development. This course will introduce you to these mechanisms, in the context of firming up your grasp on the prevailing problem solving approaches. The goal is to enhance your computer-based problem solving and software development skills beyond what you have acquired in CS 1428 (a prerequisite to CS 2308) and CS 2308 (a prerequisite to this course).

Learning objectives:


Understanding Abstract Data Types: motivations and basic concepts.


Understanding of the behavior of basic data structures (lists, stacks, queues, trees (binary trees and tree traversals, height-balanced trees), graphs, hash tables).


Ability to analyze a problem and determine the appropriate data structure for the problem.


Understand the importance of data modeling and data structures in advanced programming.


Understand and analyze elementary algorithms: sorting, searching and hashing.


Ability to analyze the impact of data structures technique on the performance of algorithms (time and space complexity)/programs.


Deep understanding of recursion and its applications.


Data structure implementation issues. Understanding of dynamic versus array implementations of data structures, factors involved in deciding on an implementation technique.


Practice in writing modular programs using the data structures that have been studied.


Understanding the mechanics of code design, organization, and the development environment.


Understanding data structure implementation in C++ using header files and implementation files.


Prerequisites:

C or better in CS 2308 (Foundations of Computer Science II ) or equivalent.

C or better in MATH 2358 (Discrete Mathematics I).


Suggested Supplementary Reference (for self-consumption):
IMPORTANT: Overwhelmingly, the most important reference material consists of lecture notes and examples accessible online. Typically, some lecture notes (or parts of them) are skimmed over or skipped due to time. Students are expected to attend lectures (and take good notes) to be primed for what to expect in exams/quizzes and to do well.

Photo of book cover TITLE:
Data Structures and Other Objects Using C++
AUTHORS:
Michael Main and Walter Savitch
EDITION / PUBLISHER / YEAR / ISBN:
4th / Add. Wesley / 2011 / 0-13-212948-5


    
    
Platform and Programming Environment:





GNU C++ in Linux command-line environment (as supported by the computer science department).


Course Plan:


Lectures/Discussions (main bullets)
and
Descriptions (sub-bullets)
(contents guide meant to be adaptively applied, not rigidly followed) 

Readings/References
(C,S -> Chap.,Sec. of text)
(LN -> Lecture Note)



Introduction. class website






Class website.


Highlights -> administrivia, success tips, etc.





Big picture: CS - problem solving - data structures and algorithms - data types - abstraction. S1.1, S1.3
LN305 thru LN307
LN308s01, LN308s02





Object-based data type development using C++ -> review/augment by example. C2, C4
LN301 thru LN304
LN304s00 thru LN304s02






Data type notions/concepts, approaches and techniques/methodologies:
data type defined -> representable set of values and applicable set of operations.
data type how-to aspects -> consumers use and suppliers build, respectively:
  - what operations supported -> interface
  - how values represented and how operations implemented -> implementation.
one trendsetting observation on interface and implementation:
  - interface -> less prone to change than implementation.
  - good to separate interface from implementation -> information hiding.
  - reduces disturbance on client programs when implementation changes.
another trendsetting observation on interface and implementation:
  - relationship between interface and implementation -> one-to-many.
  - good to abstract away implementation -> abstract data type (ADT).
  - simplifies description/classification/evaluation/... of data type (structure).
contract-oriented data type design technique -> mimicking everyday client-supplier contract:
  - preconditions   
what to expect (as benefit) or what to guarantee (as obligation).
(one or the other, depending on whether client or supplier standpoint)
  - postconditions
  - invariant ->  what to maintain (to uphold rules that reveal/constrain/... the
properties/relationships/... of an object of the data type). 



C++ class what and how:
simpler class features and indigenous data types.
more involved class features and exogenous data types.






Set, bag (multiset) and list (sequence) ADTs. C3
LN308






Interface specifications: the what.


Implementations using fixed-sized and dynamic arrays (non-template): the how.


Variant: sorted list (sorted sequence) ADT.





Generic programming. S6.1 thru S6.3
LN311






Toward developing more generally-applicable software components.


C++ function and class templates, and S(tandard) T(emplate) L(ibrary).


Implementation of bag and list ADTs using fixed-sized and dynamic arrays (templated).





Basic algorithm analysis. S1.2, Appendix B
LN309, LN310






Space-time trade-offs and other practical considerations.


Best, average and worst case scenarios.


Big-O characterization.





Linked lists. C5
LN312, LN313, LN322






Motivation:
basic searching and sorting:
  - practical significance of searching and sorting.
  - linear/sequential/serial search and binary search.
  - selection, bubble and insertion sorting algorithms.
insert/delete and grow/shrink anomalies associated with (sorted) arrays.



What's gained and lost -> array vs linked list.


Pointer-based implementation of linked lists.


Implementation of ADTs using linked lists.


Variants -> doubly-linked list, circular linked list, etc.





Stack and queue ADTs. C7, C8
LN314 thru LN316






Restricted forms of list ADT with widely useful operational characteristics:
last-in-first-out (LIFO) -> order reversal.
first-in-first-out (FIFO) -> order echoing.



Select applications:
managing invocation flow of control -> system/call/runtime stack.
transforming/evaluating expressions.
breadth-first (level) traversal of non-linear structures.



Implementation using array and linked list.


Variant: priority queue ADT.




    
    

Recursion and recursive thinking -> divide-and-conquer + "results of division" are identical and smaller versions of "what gets divided". S9.1
LN317






Power of recursion.


Criteria for application.


Advantages and disadvantages.


Success tips for designing/implementing recursive algorithms.


Tail recursion and indirect recursion.





Trees. C5, S11.1
LN318 thru LN321






Introduction:
common nonlinear data structure.
  - motivation for binary search tree.
recursive definition and other properties.
  - tree vis-a-vis graph and linked list.
general tree -> binary tree -> binary search tree and heap:
  - from general to more and more specialized.
some tree terminologies.



Representation:
array implementation (complete binary tree).
pointer-based implementation.



Binary tree traversals:
breadth-first (level).
depth-first.



Binary search tree:
storage rule.
insert, search, remove.
balancedness.
underlying data structure for ADTs -> bag as example.



Heap:
storage and shape rules.
insert and reheap up, remove and reheap down.
underlying data structure for priority queue ADT.






More advanced sorting algorithms. S13.2, S13.3
LN323






Heap sort.


Merge sort.


Quick sort.





Hashing. S12.2 thru S12.4
LN324, LN324a






Introduction:
another storage/access technique -> for keyed (IDed) data only.
  - comparison with "sequential/serial" and "ordered/binary".
perfect hashing.
  - 100% temporal consideration (constant-time search), 0% spatial consideration.
hashing in general.
  - compromise between temporal and spatial efficiencies.
some terminology -> hash table, hash function, hash value, collision, etc.



Some intuitive perspectives:
essence of approach:
  - table-lookup -> much like dictionary lookup.
  - key mapped to index (into hash table).
  - table/dictionary/map ADT.
hash function -> fast, deterministic, "randomized".
placement and retrieval -> must follow same mapping and collision resolution.



Design factors/issues -> collision-reduction challenges:
table size -> load factor and rehash.
bucket size.
hash function.
collision resolution.
  - closed hashing (open addressing).
    ○ linear probing and primary clustering.
    ○ (time permitting) quadratic probing and secondary clustering, and double hashing.
  - open hashing -> chained hashing (separate chaining).



Search, insert and delete algorithms.





Other advanced topics (time permitting) -> height-balanced trees, graph, introduction to parallel computing, etc. (Some may best be covered at other appropriate junctures, not necessarily toward the end.) S11.3, C15
LN320a01a-c, LN325, LN326-8


Attendance and Quizzes:

Attendance is required. A short in-class quiz (administered via TxState Canvas) is slotted for the ending 5 minutes of each normal, non-exam class meeting. Your quiz-participation record (on Canvas) will be key reference when addressing attendance/participation -related matters. 


As incentive to discourage absentism, attendance will contribute up to 5% toward the overall grade (see also under Grading Criteria).



(1) Any requests for excused absences (for acceptable reasons with supporting documentation) must be received before Canvas quiz scores are finalized/recorded for the day.



(2) An unexcused absence will be assigned a quiz score of 0 and affect overall attendance. An excused absence will not receive a quiz score and will not affect overall attendance.



(3) To avoid having to deal with unexcused cases, especially the borderline ones, each student is allowed an up to 5% non-attendance cushion.


Missing the first class meeting is especially bad since pivotal "must-knows" (class website, class calendar, how class is administered, grading criteria, success tips, . . .) are made known and "missing class(es) is the bane of successful course completion" is underscored.


Assignments:

8 required assignments (typically).

(Repeated for emphasis) Assignments will be managed using TxState Canvas:


Task descriptions, deliverables, due dates and times, . . . (Typically via externally linked pages.)


Submission of deliverables. (Typically source files in as-is text format and print-to-PDF format.)


Return of graded work. (Typically submitted source files in print-to-PDF format that have been marked.)


You are responsible for the successful and timely uploading of the proper files when submitting assignments.


(In fairness to all, negative consequences of any failures to do so will be incontestably applied.)

In general, late work will not be accepted (and will earn no credit).


I reserve the right to make due-date/due-time extensions or relaxations where appropriate.


Exams:

3 required exams: 


Exam 1 (1hr 20min)


Exam 2 (1hr 20min)


Final (2hr 30min) - comprehensive (cumulative) but tend to emphasize on areas that have not been tested. 

All exams will regularly happen physically in-class (face-to-face) and be of the written type that are closed book/note/collaboration.


If (for factors like extenuating circumstances) exams must irregularly and virtually/remotely be administered, they will be on-line via TxState Canvas.



Such exams are subject to additional constraints with the following being key ones:



Students are to also meet virtually (via Zoom Video Communication) and have video on to enable some modicum of proctoring.



Questions will be presented one at a time (students cannot see the entire exam at once).



Students are not allowed to go back to previous questions already answered.


Grading Criteria:

Exams: 70% (20% Exam 1, 20% Exam 2, 30% Final).

Assignments: 25%. 

Quizzes (Attendance/Participation): 5%.


Withdrawal Policy:

Registrar's Office guidance: https://www.registrar.txstate.edu/resources/dropping-vs-withdrawing.html.

After the automatic W period, you are expected to check with me prior to dropping the class to see if you will receive a grade of W or F. 


Make-Up Exams, Tests or Quizzes:

Make-up exams will only be given under unexpected and truly severe situations, which must be supported by some official document. There will be no make-up quizzes. 


Academic Honesty:

Unless indicated otherwise, all work submitted is to be your individual work. Violations will be dealt with according to university policies - see Student Handbook pages 46-47 or look under here.

*** Important  Note ***


Any attempts at obtaining homework, project, or exam solutions from “note sharing sites” such as Chegg and CourseHero or from other sources are considered cheating and carry the same penalty. The department regularly monitors websites for posted solutions.


Course Related Material:

Students can access these via the course site on TxState Canvas. Students are responsible for keeping abreast of the latest postings.


Where expedient and appropriate, I will do my best to alert students (such as during class meetings or through mass emails) of upcoming, freshly added or updated postings.

*** Copyright  Note ***


Unless otherwise noted, the materials provided in conjunction with this course, including but not limited to the lecture notes, handouts, homeworks, exams, and source code, are protected by copyright and for the exclusive use of the students enrolled in the course. Allowing others to access any of this material by posting it on public repositories such as git or submitting it to “note sharing sites” such as Chegg and CourseHero (which encourage you to break the law and post copyrighted content you do not own) is expressly forbidden. Note that you are not allowed to publicly post any of this material even if you made modifications. This copyright protection extends past the end of the semester.


Guidelines, Success Tips, and Other Pertinent Issues:

These will be mentioned in class and/or posted on the course site. Students are responsible for noting these while attending class (one reason why class attendance is important) and by visiting the course site often. I typically spend some time on the first day of class highlighting these.


Collective (Mass), Outside-of-Meeting, Instructor-to-Students Communication:

Electronically via students' Texas State email accounts (using a mailing list created from students' Texas State NetIDs).


Special Needs:

Students with special needs (as documented by the Office of Disability Services) should have the instructor informed at the beginning of the semester/session.