PROGRAMMING

ASSIGNMENT 4

BBM203 Software Practicum I

Programming Assignment 4

Topics: Binary Search Trees, Red-Black Trees, KD-trees, kNN

classification, Recursion, Dynamic Memory Allocation, File I/O

Programming Language: C++11 – You MUST USE this starter code

Saving Dr. Elara: Mission Rescue

Following the groundbreaking decryption of

the Celestial Tapestry in Programming Assignment 1, a glimmer of hope emerged in

the search for the enigmatic Dr. Elara. The

tapestry, once thought to be a mere cosmic

anomaly, revealed itself as a beacon, guiding

us to the whereabouts of Dr. Elara, who vanished while exploring the uncharted depths of

space.

The tapestry’s hidden messages, a cryptic

blend of celestial coordinates and obscure

references to distant planets, have led to

the formulation of a daring rescue operation: Mission Rescue. This high-stakes venture, spearheaded by the same brilliant team

of HUBBM and HUAIN students who unraveled the tapestry’s secrets, aims to traverse the perilous expanses of space to retrieve Dr. Elara from where she is believed

to be stranded.

The path to Dr. Elara is fraught with challenges that demand not only courage but unparalleled scientific

and computational expertise. To navigate these unknown realms, three critical tasks have been identified,

forming the backbone of the mission’s strategy: planetary classification system, space sector mapping,

and space sector mapping optimization.

Mission Rescue is not just a simple retrieval mission; it is a journey into the unknown, a testament to

human ingenuity and the relentless pursuit of knowledge. As the team embarks on this perilous voyage,

they carry not only the hope of finding Dr. Elara but also the aspirations of unraveling the mysteries that

lie beyond our known universe.

Programming Assignment 4 – Deadline: 29/12/2023 at 23:59:59

Hacettepe Computer Engineering – BBM203 Software Practicum I – Fall 2023

1 Task 1: Space Sector Mapping

The vastness of uncharted space requires a comprehensive mapping system. In this part, you

are tasked with implementing Binary Search Tree

(BST) algorithms to map space sectors along exploration routes. Your goal is to identify the path

to the final destination sector, where Dr. Elara is

believed to be stranded on one of the habitable

planets.

As the journey unfolds, new sectors may be discovered that need to be integrated into the space

map, while the sectors identified as dangerous must be removed from the map. This dynamic process will

enable the explorers to adjust the mission’s trajectory in real-time, ensuring a strategic and responsive

approach to the ever-evolving landscape of space exploration.

1.1 Input File and Sector Map Initialization

Space sectors are represented by their position relative to the Earth’s position on

a 3D coordinates (X, Y, Z) plane. An input file containing the coordinates of the

already discovered space sectors, structured as a DAT file, will be given as the first

command line argument. Your task is to parse this file within your program to set up

the initial BST that will represent the sector map within the SpaceSectorBST class.

The member pointer variable Sector *root must be set to point to the root node

of this tree. The content of a sample input file given as sectors.dat is illustrated

on the right.

X,Y,Z

0,0,0

10,20,30

-5,-15,10

25,5,0

-10,-20,-30

5,15,-10

-5,15,10

-25,-5,0

30,40,50

15,-25,35

0,30,-40

-10,-20,30

15,-20,35

Sector nodes should be inserted into the BST based on their (X, Y, Z) coordinates as follows:

• Primary, Secondary, and Tertiary Sorting Criteria: The primary sorting criterion is the Xcoordinate. If two sectors have the same X-coordinate, the Y-coordinate becomes the secondary

sorting criterion. If both X and Y coordinates are the same, the Z-coordinate is used as the tertiary

sorting criterion.

• Insertion Process: Start by comparing the X-coordinates of sectors. For sectors with the same

X-coordinate, compare their Y-coordinates. For sectors with identical X and Y coordinates, compare

their Z-coordinates.

• Tree Structure: A node’s left child has a smaller X-coordinate or, in case of a tie, a smaller

Y-coordinate, or, if both are tied, a smaller Z-coordinate. A node’s right child has a greater or equal

X-coordinate (and then Y, Z are considered similarly in case of ties).

You are expected to implement the node insertion recursively. In a recursive approach, the tree

is traversed from the root down to the appropriate null link, and the new node is inserted at that point.

The recursive function will:

• Take the current node and its coordinates as parameters.

• Begin the insertion with the root node of the BST. If the BST is empty (root is nullptr), the

new node becomes the root. Otherwise, compare the coordinates as explained above. If the sorting

criteria require inserting the sector to the left, proceed to the left child. Otherwise, proceed to the

right child.

Programming Assignment 4 – Deadline: 29/12/2023 at 23:59:59

Hacettepe Computer Engineering – BBM203 Software Practicum I – Fall 2023

• Recursive Call: Make a recursive call to the insert function with the left or right child based on the

comparison. If the child is nullptr, create a new node with the given distance and attach it at

this position.

• Return the current node after each recursive call to maintain the tree structure. This ensures that

the root of the subtree (or the overall tree) is updated correctly after the insertion.

• The node returned from the initial call of the function (starting at the root) is assigned as the new

root of the BST.

Each sector can be uniquely identified within the BST map by its sector code that is calculated based

on its (X, Y, Z) coordinates and its distance from the Earth. To calculate the sector distance in Astro

Units (AUs) from the Earth based on their coordinates, you should use the Euclidean distance formula in

three dimensions: d =

q

(x2 − x1)

2 + (y2 − y1)

2 + (z2 − z1)

2

. The Earth is considered to have the origin

point of (0, 0, 0).

Sector Code Generation Logic:

• Distance Component: The first part of the sector code is derived from the sector’s distance from

Earth, truncated to integer (you can think of it as the floor operation). ← [updated]

• Coordinate Components: The next parts of the code are determined by the sector’s X, Y , and

Z coordinates.

– For each coordinate, if the coordinate value is 0 (same as the Earth’s), append the letter ‘S’

to the code, signifying ‘Same’ in that dimension.

– If the coordinate value is positive, append the respective directional letter: ‘R’ for Right (X),

‘U’ for Up (Y ), and ‘F’ for Forward (Z).

– If the coordinate value is negative, append the respective directional letter: ‘L’ for Left (X),

‘D’ for Down (Y ), and ‘B’ for Backward (Z).

An example: For a sector located at coordinates

(10, −5, 20) and 22.91 AUs

away from Earth, the distance

component is 22. The coordinate components are R for X

(positive), D for Y (negative),

and F for Z (positive). So,

the resulting sector code is

22RDF . ← [updated]

After reading the sample input provided in the given

sectors.dat file, and inserting the sector nodes into the

BST in order they appear in

the input file, while adhering to the sorting criteria and

calculating the sector codes

correctly, you should obtain

the sector map shown on the

right.

Programming Assignment 4 – Deadline: 29/12/2023 at 23:59:59

Hacettepe Computer Engineering – BBM203 Software Practicum I – Fall 2023

1.2 Deleting Dangerous Sectors from the Sector Map

When a sector is deemed dangerous, it must immediately be removed from the map. A dangerous sector

will be identified by its sector_code. You are expected to implement the sector node deletion

recursively. To implement deleting a sector node from the sector tree map, you’ll need to follow the

standard procedure for deleting a node from a binary search tree. This involves finding the node with the

matching sector_code, then handling three cases if found:

• Node with no children (leaf node): Simply remove the node.

• Node with one child: Replace the node with its child, and then delete the node.

• Node with two children: Find the node’s in-order successor (smallest node in the right subtree),

swap it with the node, and then delete the node.

For instance, if the sector 37RUF is identified as dangerous, the resulting tree, shown on the right in

the figure below, illustrates the sector tree map after the removal of this sector.

0SSS

18LDF

18LUF 18RUB

25LDS

37RUF

25RUS

*root

37LDB

37LDF 50SUB 45RDF

43RDF

70RUF

0SSS

18LDF

18LUF 18RUB

25LDS

45RDF

25RUS

*root

37LDB

37LDF 50SUB 43RDF 70RUF

Delete “37RUF”

1.3 Finding a Path to Dr. Elara

In order to be able to search through the sector tree map, you need to implement three BST traversal

algorithms: inorder, preorder, and postorder. In a BST, inorder traversal retrieves elements in sorted

order, preorder traversal is useful for creating prefix expressions and operations involving processing the

root before the children, while postorder traversal is commonly employed for deleting nodes and processing

children before the root. Below, the expected traversal outputs are given for the initial sample tree:

Space sectors inorder traversal:

25LDS

37LDB

37LDF

18LDF

18LUF

0SSS

50SUB

18RUB

37RUF

45RDF

43RDF

25RUS

70RUF

Space sectors preorder traversal:

0SSS

18LDF

37LDB

25LDS

37LDF

18LUF

37RUF

18RUB

50SUB

25RUS

45RDF

43RDF

70RUF

Space sectors postorder traversal:

25LDS

37LDF

37LDB

18LUF

18LDF

50SUB

18RUB

43RDF

45RDF

70RUF

25RUS

37RUF

0SSS

Programming Assignment 4 – Deadline: 29/12/2023 at 23:59:59

Hacettepe Computer Engineering – BBM203 Software Practicum I – Fall 2023

In Programming Assignment 1, you successfully decoded a hidden message within the Celestial Tapestry.

This message revealed a sector code, believed to be a secret communication from Dr. Elara, notifying us

about her location in the cosmos. It’s now imperative that we organize a rescue mission to ensure Dr.

Elara’s safe return to Earth.

0SSS

18LDF

18LUF 18RUB

25LDS

37RUF

25RUS

*root

37LDB

37LDF 50SUB 45RDF

43RDF

70RUF

Your mission in this assignment is to navigate the

space sector tree map to determine the exact path

needed to reach the sector where Dr. Elara is

stranded. Envision space as interconnected by

wormholes, linking sectors directly in a manner reflected by the child links in our sector tree map.

Starting from Earth’s sector, the rescue team will

use these inter-sector connections to travel, or perform ‘space jumps’, from one sector to another,

ultimately reaching Dr. Elara’s location. You are

tasked with implementing a function that returns

a vector of pointers to Sector nodes, such that,

following these pointers in sequence will guide the

rescue team from Earth to Dr. Elara’s sector. For

instance, if based on the decoded message, we were

informed that Dr. Elara is in sector 45RDF . Using the initial sample tree structure (before any deletions),

the expected path (illustrated on the right) should be printed like this:

The stellar path to Dr. Elara: 0SSS->37RUF->25RUS->45RDF

In this example, the expected path to the intended sector necessitates three space jumps to reach the

destination from the Earth.

1.4 Challenges of Using Unbalanced BSTs in Space Sector Mapping

Now, imagine that the sector input file was given such that the sectors are

already sorted based on the insertion criteria. For example, let the sample

sectors_sorted.dat file be structured as shown on the right.

X,Y,Z

0,0,0

0,10,20

0,10,30

10,30,50

10,40,50

0SSS

22SUF

31SUF

*root

59RUF

64RUF

Using the approach we employed to generate the BST map could lead

to a completely imbalanced tree as illustrated on the left. This imbalance has significant drawbacks, notably prolonged search times. More

critically, it could compel the rescue team to undertake a considerably

lengthier stellar journey to reach the destination sector where Dr. Elara

is stranded. For instance, in this particular map, if the destination sector is 59RUF , the team would need to make three space jumps to get

there. In contrast, if the tree were balanced, they might only need one

or two jumps. The efficiency of the path is directly affected by the tree’s

structure, emphasizing the importance of a balanced tree in optimizing

the rescue mission.

For this reason, in the next task of this assignment, you are expected to

implement a better, balanced version of the sector tree map.

Programming Assignment 4 – Deadline: 29/12/2023 at 23:59:59

Hacettepe Computer Engineering – BBM203 Software Practicum I – Fall 2023

2 Task 2: Optimizing Space Sector Mapping

In this task, you are expected to use the same input DAT file, given as the first command line argument,

parse the sectors by the given coordinates to implement a new sector tree map, but this time, you must

use a Left-Leaning Red-Black Tree (LLRBT) data structure instead of BST. The choice of an LLRBT is

crucial for this task due to its self-balancing properties.

Node Insertion into the Left-Leaning Red-Black Tree (LLRBT): In this task, you are expected

to implement recursive node insertion into the LLRBT. This tree structure is a variation of the standard

Red-Black Tree, with a specific emphasis on ensuring that red links lean left. The recursive function for

inserting a new sector node into the LLRBT will follow these steps:

• Parameter Handling: Take the current node and its coordinates as parameters.

• Initiation: Start the insertion with the root node of the LLRBT. If the tree is empty (root is

nullptr), the new node becomes the root and is colored BLACK. Otherwise, proceed based on the

sector coordinates. Insert to the left child if criteria dictate, otherwise to the right.

• Recursive Insertion: Make a recursive call to the insert function with the appropriate child (left or

right) based on the comparison. If reaching a nullptr position, create a new red node with the

given coordinates.

• Fixing Violations: After insertion, fix any violations of the LLRBT properties. This includes enforcing

left-leaning red links, balancing consecutive red links, and ensuring every path from a node to

descendant leaves contains the same number of black nodes. This is achieved through rotations

(left or right as needed) and color flips.

• Structure Maintenance: Return the current node after each recursive call. This step is crucial as

the subtree’s root (or the overall tree) might change due to rotations during the fixing process.

• Root Assignment: The node returned from the initial function call (starting at the root) becomes

the new root of the LLRBT.

Please note, the insertion process in an LLRBT differs significantly from both a standard Binary Search

Tree and a traditional Red-Black Tree. It requires specific operations such as left-leaning rotations and

color flips to maintain the unique properties of LLRBTs.

0SSS

22SUF

31SUF

*root

59RUF

64RUF

Remember the example from Section 1.4 where we obtained

a completely imbalanced, right-leaning tree? In that particular case, the destination sector was 59RUF , and the rescue team needed to make three space jumps to reach Dr.

Elara.

In contrast, by using a Red-Black Tree instead of a Binary

Search Tree, our new space sector map is balanced (see the

figure on the right), and the rescue team will only need to

perform two space jumps instead of three. Note that the parent pointers are represented as yellow arrows, while the node

color is shown as black or red node outlines. The efficiency

of the rescue path is directly affected by the tree’s structure,

emphasizing the importance of a balanced tree in optimizing

the rescue mission.

Programming Assignment 4 – Deadline: 29/12/2023 at 23:59:59

Hacettepe Computer Engineering – BBM203 Software Practicum I – Fall 2023

2.1 Finding an Optimized Path to Dr. Elara

In order to be able to search through the sector tree map, you can use the same three BST traversal

algorithms: inorder, preorder, and postorder, implemented in the previous task. Below, the expected

traversal outputs are given for the obtained sample tree:

Space sectors inorder traversal:

BLACK sector: 0SSS

RED sector: 22SUF

BLACK sector: 31SUF

BLACK sector: 59RUF

BLACK sector: 64RUF

Space sectors preorder traversal:

BLACK sector: 59RUF

RED sector: 22SUF

BLACK sector: 0SSS

BLACK sector: 31SUF

BLACK sector: 64RUF

Space sectors postorder traversal:

BLACK sector: 0SSS

BLACK sector: 31SUF

RED sector: 22SUF

BLACK sector: 64RUF

BLACK sector: 59RUF

Note that this time, the color of the sector in the tree is also displayed.

0SSS

22SUF

31SUF

*root

59RUF

64RUF

Your mission in this task is to navigate the space sector tree

map to determine the exact path needed to reach the sector

where Dr. Elara is stranded, noting that the sector where the

Earth is located may not be the root sector node any longer.

Sectors are again connected by two-way wormholes, reflected

by the child and parent links in our new sector tree map.

Starting from Earth’s sector, the rescue team will use these

inter-sector connections to perform ‘space jumps’ from one

sector to another, ultimately reaching Dr. Elara’s location.

You are tasked with implementing a function that returns a

vector of pointers to Sector nodes, such that, following these

pointers in sequence will guide the rescue team from Earth to

Dr. Elara’s sector. For instance, if based on the decoded message, we were informed that Dr. Elara is in sector 59RUF .

Using the obtained sample tree structure, the expected path

(illustrated on the right) should be printed like this:

The stellar path to Dr. Elara: 0SSS->22SUF->59RUF

In this example, the expected path to the intended sector necessitates two space jumps to reach the

destination from the Earth.

0SSS

22SUF

31SUF

*root

59RUF

64RUF

In the second example illustrated on the right, we assume that

that Dr. Elara is in sector 31SUF , based on the decoded message. Using the obtained sample tree structure, the expected

path should be printed like this:

The stellar path to Dr. Elara: 0SSS->22SUF->31SUF

Via this example, we observe that the path may or may not visit

the root of our tree sector map. Another important note: make

sure that there are no duplicate Sector nodes in the path!

If a path cannot be found, the expected output is:

A path to Dr. Elara could not be found.

Programming Assignment 4 – Deadline: 29/12/2023 at 23:59:59

Hacettepe Computer Engineering – BBM203 Software Practicum I – Fall 2023

3 BONUS Task 3: Planetary Classification System (for extra 20

pts.)

To embark on a quest to locate Dr. Elara, we start with classifying habitable planets from a collection of

planetary systems. For this, we will employ a k-Nearest Neighbors (kNN) classification, enhanced with a

KD-tree, to efficiently process this astronomical data. KD-tree comes in handy for calculating the nearest

neighbors for kNN classification algorithm, reducing the computational complexity from O(n) to O(logN)

on the average.

3.1 Prerequisite Knowledge

3.1.1 Classification:

Classification in machine learning is a form of supervised learning, where the algorithm learns from labeled

training data to categorize new observations. The aim is to predict discrete labels or categories based

on learned patterns and relationships in the data. In our application, the algorithm will classify planets

as ‘habitable’ or ‘not habitable’ using known examples. Here, we hold the assumption that planets with

similar properties with those of habitable ones are also potentially habitable.

3.1.2 kNN and kNN Classification:

k-Nearest Neighbors (kNN) is a straightforward yet effective method for classification. It assumes that

similar items exist in close proximity. In this method, a data point is classified based on the majority label

among its ‘k’ nearest neighbors in the feature space. The feature space refers to the n-dimensional space,

where each record/row in the dataset corresponds to a point. Here, each row contains n features and a

label. That is, the algorithm classifies a new instance by checking with the closes n nearest neighbors in

the same feature space. Based on the label of the majority in the neighborhood, label of the new instance

is predicted.

In practice, kNN does not learn a specific model for classification but rather memorizes the training set.

Upon receiving a new observation, kNN calculates distances to all training points, finds the ‘k’ closest

ones, and assigns the most frequent label among them to the new point.

3.1.3 KD-Tree and Its Use in kNN

• KD-Tree Overview: A KD-Tree, or K-Dimensional Tree, is a space-partitioning data structure

used for efficiently organizing points in a multi-dimensional space. The key advantage of KD-Trees

lies in their ability to reduce the computational complexity of searching for nearest neighbors, which

is particularly significant in high-dimensional spaces.

• Computational Cost: Naive kNN Search vs. KD-Tree Search: In a naive kNN search,

the computational cost to find the nearest neighbors for a query point involves calculating the

distance from the query point to every other point in the dataset. This process has a computational

complexity of O(N), where N is the number of points in the dataset.

In contrast, a KD-Tree organizes the data points in a tree structure, dividing the space into regions

at each node. This structure allows the kNN search to effectively eliminate large portions of the

search space that do not need to be explored, thus significantly reducing the number of distance

calculations. The average computational complexity for a kNN search in a well-balanced KD-Tree

Programming Assignment 4 – Deadline: 29/12/2023 at 23:59:59

Hacettepe Computer Engineering – BBM203 Software Practicum I – Fall 2023

is O(logN), making it much more efficient than the naive approach, especially as the size of the

dataset grows.

Figure 1 showcases the partitioned search space with a KD-tree and the partitions that can be

overlooked, making the search process more efficient.

Figure 1: KD Tree extracted from https://www.baeldung.com/cs/k-d-trees

3.1.4 Variance

Variance is a statistical measure that describes the spread of numbers in a dataset. It indicates how much

the values in the dataset deviate from the mean. In the context of the kd-tree, variance is used as a

criterion to assess whether further splitting of the data is beneficial.

In the context of our kd-tree algorithm, a low variance on a particular dimension suggests that the data

points are closely clustered together along that dimension. In the kd-tree construction process, if the

variance of the data points in the current node is below a certain threshold, it implies that the points

are similar enough to be grouped together, and the node can be turned into a leaf, stopping further

partitioning along that branch. This is called pre-pruning the tree.

Variance(σ

2

) = 1

N

X

N

i=1

(xi − µ)

2

where:

σ

2

is the variance

N is the number of data points

xi represents each individual data point

µ is the mean (average) of the data points

X is the summation symbol, used to sum the squared differences

between each data point and the mean

Programming Assignment 4 – Deadline: 29/12/2023 at 23:59:59

Hacettepe Computer Engineering – BBM203 Software Practicum I – Fall 2023

3.1.5 Normalization

Normalization is a fundamental data preprocessing step in machine learning and data mining. It involves

adjusting the range of your data so that different features contribute equally to the final results. In the

context of k-Nearest Neighbors (kNN) and KD-Trees, normalization is essential for ensuring that each

feature equally influences the distance calculations, thereby improving the accuracy and effectiveness of

the algorithm.

Standardization, a common normalization technique, transforms the data to have a mean of 0 and a

standard deviation of 1. This is particularly useful in cases where the features in your dataset are measured

in different units and have varying ranges. The mathematical formula for standardization is given by:

x

0 =

x − µ

σ

(1)

Where:

• x

0

is the standardized value.

• x is the original value.

• µ is the mean of the feature values.

• σ is the standard deviation of the feature values.

In this assignment, you will apply standardization to the features of the planetary data before using them

in the kNN algorithm and KD-tree construction. This process will ensure that each feature contributes

proportionally to the distance calculations, leading to more reliable classification results.

3.2 KD-Tree Construction Algorithm

Below, you can see the steps we will use for building the kd-tree within the context of this

PA.

1. Determine the Split Dimension: The first step involves choosing the dimension along which to

split the data. The preferred dimension is the one with the highest variance, as this is likely to lead

to a more balanced distribution of data points in each subtree, potentially resulting in a tree that

requires fewer splits.

2. Check for Stopping Condition: Before splitting, the algorithm evaluates the variance of the data

points along the chosen dimension. If this variance falls below a predefined threshold, the algorithm

stops dividing the data at this node and creates a leaf node, bundling together the data instances

that fall into that leaf node. This approach helps to prevent overly complex tree structures in areas

where the data is already densely clustered.

3. Determine the Split Value: The split value is calculated as the median of the data points along

the chosen split dimension. Selecting the median ensures a balanced split, ideally dividing the data

into two subsets with an equal number of points.

4. Split the Data: Divide the data into two groups based on the split value. Points below the median

in the split dimension form the left group, while points at or above the median form the right group.

The data point that corresponds to the split value itself is included in the right group to maintain

consistency.

5. Create an Internal Node: Construct a new internal node that stores the split dimension and

value. Note that internal nodes in the KD-Tree do not store actual data points. Instead, they serve

as decision points to guide the tree’s traversal.

6. Recursively Build Subtrees: Apply the process recursively to the left and right groups of data.

Continue this division until all data points are allocated to leaf nodes.

Programming Assignment 4 – Deadline: 29/12/2023 at 23:59:59

Hacettepe Computer Engineering – BBM203 Software Practicum I – Fall 2023

7. Data Points in Leaf Nodes: It is important to note that all real data points in the KD-Tree are

stored exclusively in the leaf nodes. These leaf nodes act as ‘buckets’ that contain the actual data

points.

3.3 k-Nearest Neighbors Search using KD-Tree

The k-Nearest Neighbors (kNN) search algorithm in a KD-Tree efficiently finds the ‘k’ closest data points

(neighbors) to a given query point in a multi-dimensional space. The algorithm consists of the following

steps:

1. Traverse to the Relevant Leaf Node: Starting at the root, traverse the KD-Tree to find the leaf

node that the query point would fall into. This involves comparing the query point’s dimensions to

those of the current node and moving to the left or right child node accordingly, until a leaf node

is reached.

2. Search within the Leaf Node: In the leaf node, calculate the distance from the query point to

each point within the node’s data bucket. Keep track of the ‘k’ closest points in this bucket.

3. Backtrack to Explore Other Branches: After searching the leaf node, backtrack up the tree to

previously visited nodes to check if other branches need to be explored. This is necessary because

the nearest neighbors might lie across different partitions divided by the KD-Tree.

• For each node during backtracking, calculate the distance from the query point to the node’s

splitting dimension.

• If this distance is less than the distance to the furthest point in the current set of ‘k’ nearest

neighbors, explore the opposite branch of this node as it might contain closer points.

4. Update Nearest Neighbors List: If closer points are found in other branches during the backtracking process, update the list of ‘k’ nearest neighbors accordingly.

5. Complete the Search: If there is no closer points in the other branches, stop the search.

This approach ensures that the KD-Tree is thoroughly searched for the nearest neighbors, considering

both the initial path to the leaf node and potential closer neighbors in other parts of the tree.

3.4 Input File

An input file, in DAT format, with the information about the known planets will be given as the second

command line argument. This training file includes the space-separated planet features and a label that

states if it is habitable or not. Additionally, this file includes the threshold value to be used for kd-tree

construction, determining when to stop growing the tree. For testing, an instance of a planet’s features

with no label will be created to be used for classification (see an example usage in sample main.cpp).

Below is an excerpt from the input file showing its structure and some sample data:

# Planet Data File

# Features: Distance_from_Star, Planet_Mass, Average_Surface_Temp,

# Atmospheric_Oxygen, Water_Surface_Coverage, Orbital_Period, Star_Luminosity

# Label: Habitability

#

# Data

1.32,3.17,18.57,0.22,81.15,695.92,0.91,Habitable

1.57,3.79,-97.98,0.41,47.6,264.89,1.12,Habitable

1.40,5.29,-4.83,0.11,52.31,630.72,1.27,Not Habitable

1.31,7.53,41.75,0.24,25.05,543.67,1.35,Not Habitable

# Threshold

0.05

Programming Assignment 4 – Deadline: 29/12/2023 at 23:59:59

Hacettepe Computer Engineering – BBM203 Software Practicum I – Fall 2023

3.5 Task Implementation Steps

The steps to implement the given task are given below:

1. Parsing: Read the files in DAT format. Identify the features, label (habitability), and the threshold

value.

2. Normalization: Standardize the each feature seperately, as explained in 1.1.5.

3. KD-Tree Construction: Using the standardized data, construct the KD-Tree, considering the

threshold value to determine when to stop tree growth.

4. kNN Classification: Standardize each feature of the test datum using the corresponding feature’s

mean and standard deviation in the train data. Then, use the built kd-tree to find the nearest

neighbors and perform majority voting for classification on a given instance.

For the given sample input data, the expected KD-tree is illustrated in the figure below.

Habitable

Atmospheric Oxygen

*root

< -1.089 ≥ -1.089

Star Luminosity Avg. Surface Temp.

< -1.305 ≥ -1.305 < -1.42 ≥ -1.420

Not Habitable Habitable Not Habitable

INFO-CIRCLE

Note that with larger input data sizes, it is possible for more than one planet to be grouped

into any of the ’buckets’ in the KD-Tree. Once the variance in the data is below the

predefined threshold, we stop growing the tree. In such cases, we will have a bucket

containing multiple data points in the leaf nodes. Additionally, in this specific instance, you

observe that the resulting KD-Tree features only three dimensions, despite the availability

of more features in the input data. The selection of these dimensions is guided by the

variance in each feature, as detailed above. This approach ensures efficient and relevant

feature usage in the KD-Tree construction.

Programming Assignment 4 – Deadline: 29/12/2023 at 23:59:59

Hacettepe Computer Engineering – BBM203 Software Practicum I – Fall 2023

4 Assignment Implementation Tasks and Requirements

In this section, we outline the classes and functions you are required to implement. As aspiring engineers,

it’s crucial to base your code on the provided starter code that offers a predetermined class structure. This

ensures code clarity, proper encapsulation, and facilitates unit testing. It’s imperative that you do not

alter the names or signatures of functions provided in the template files. Likewise, refrain from renaming

or changing the types of the specified member variables. Other than that, you’re free to introduce any

additional functions or variables as needed.

Sector Class

• Constructor:

Sector(int x, int y, int z)

– Initializes a sector based on the given coordinates. Calculates the sector’s distance from the Earth

and generates its unique sector code.

• Operators =, ==, ! =: Overloading the assignment and comparison operators.

• Destructor:

~Sector()

– Frees any dynamically allocated memory if necessary.

SpaceSectorBST Class

• Constructor:

SpaceSectorBST()

– Initializes a space sector BST instance with a nullptr root (already given).

• Functions:

void readSectorsFromFile(const std::string& filename)

– Reads sectors from the given input file and inserts them into the space sector BST map according

to the coordinates-based comparison criteria.

void insertSectorByCoordinates(int x, int y, int z)

– Instantiates and inserts a new sector into the space sector BST map according to the coordinatesbased comparison criteria.

void deleteSector(const std::string& sector_code)

– Deletes the sector, given by its sector code, from the space sector BST map.

void displaySectorsInOrder()

– Traverses the space sector BST map in-order and prints the sectors to STDOUT in the given format.

void displaySectorsPreOrder()

– Traverses the space sector BST map in pre-order traversal and prints the sectors to STDOUT in the

given format.

Programming Assignment 4 – Deadline: 29/12/2023 at 23:59:59

Hacettepe Computer Engineering – BBM203 Software Practicum I – Fall 2023

void displaySectorsPostOrder()

– Traverses the space sector BST map post-order traversal and prints the sectors to STDOUT in the

given format.

std::vector<Sector*> getStellarPath(const std::string& sector_code)

– Finds the path from the Earth to the destination sector given by the sector_code and returns

this path as a vector or Sector pointers in the path.

void printStellarPath(const std::vector<Sector*>& path)

– Prints the stellar path obtained from the getStellarPath() function to STDOUT in the given

format.

• Destructor:

~SpaceSectorBST()

– Frees any dynamically allocated memory if necessary.

SpaceSectorLLRBT Class

• Constructor:

SpaceSectorLLRBT()

– Initializes a space sector BST instance with a nullptr root (already given).

• Functions:

void readSectorsFromFile(const std::string& filename)

– Reads sectors from the given input file and inserts them into the space sector LLRBT map according

to the coordinates-based comparison criteria.

void insertSectorByCoordinates(int x, int y, int z)

– Instantiates and inserts a new sector into the space sector LLRBT map according to the coordinatesbased comparison criteria.

The instructions given for the displaySectorsInOrder(), displaySectorsPreOrder(),

displaySectorsPostOrder(), getStellarPath(), and printStellarPath() functions

of the SpaceSectorBST class above, are also applicable for this class.

• Destructor:

~SpaceSectorLLRBT()

– Frees any dynamically allocated memory if necessary.

Bonus Part Classes

Please refer to the comments inside the relevant classes starter code for details.

Programming Assignment 4 – Deadline: 29/12/2023 at 23:59:59

Hacettepe Computer Engineering – BBM203 Software Practicum I – Fall 2023

Must-Use Starter Codes

You MUST use this starter (template) code. All headers and classes should be placed directly inside

your zip archive.

Grading Policy – You May Earn a Total of 120 pts.

• No memory leaks and errors: 10%

– No memory leaks: 5%

– No memory errors: 5%

• Implementation of the tasks: 80%

– Task 1 – Space Sector Mapping: 40%

∗ Reading input and properly implementing the corresponding BST structure with node

insertions: 15%

∗ Proper implementation of deletion from the BST: 10%

∗ Proper implementation of BST traversals: 5%

∗ Proper implementation of pathfinding: 10%

– Task 2 – Optimizing Space Sector Mapping: 40%

∗ Reading input and properly implementing the corresponding LLRBT structure with node

insertions: 25%

∗ Proper implementation of pathfinding: 15%

• BONUS Task 3 – Planetary Classification System: 20%

– Reading the input file and building the KD-Tree: 10%

– Searching the KD-Tree to extract the nearest neighbors and correctly classify a planet: 10%

• Output tests: 10%

Important Notes

• Do not miss the deadline: Friday, 29.12.2023 (23:59:59) .

• Save all your work until the assignment is graded.

• The assignment solution you submit must be your original, individual work. Duplicate or similar

assignments are both going to be considered as cheating.

• You can ask your questions via Piazza (https://piazza.com/hacettepe.edu.tr/fall2023/

bbm203), and you are supposed to be aware of everything discussed on Piazza.

• You must test your code via Tur3Bo Grader https://test-grader.cs.hacettepe.edu.tr/

(does not count as submission!).

• You must submit your work via https://submit.cs.hacettepe.edu.tr/ with the file hierarchy

given below:

– b<studentID>.zip

∗ KDT_Node.h <FILE>

∗ KD_Tree.cpp <FILE>

∗ KD_Tree.h <FILE>

∗ kNN.cpp <FILE>

∗ kNN.h <FILE>

∗ kNN_Data.h <FILE>

∗ kNN_DAT_Parser.h <FILE>

Programming Assignment 4 – Deadline: 29/12/2023 at 23:59:59

Hacettepe Computer Engineering – BBM203 Software Practicum I – Fall 2023

∗ Sector.cpp <FILE>

∗ Sector.h <FILE>

∗ SpaceSectorBST.cpp <FILE>

∗ SpaceSectorBST.h <FILE>

∗ SpaceSectorLLRBT.cpp <FILE>

∗ SpaceSectorLLRBT.h <FILE>

• You MUST use this starter code. All classes should be placed directly in your zip archive.

• This file hierarchy must be zipped before submitted (not .rar, only .zip files are supported).

Build and Run Configuration

Here is an example of how your code will be compiled (note that instead of main.cpp we will use our

test files):

$ g++ -g -std=c++11 *.cpp, *.h -o mission_rescue

Or, you can use the provided Makefile within the sample input to compile your code:

$ make

After compilation, you can run the program as follows:

$ make

$ ./mission_rescue sectors.dat simple_planet_train.dat

Exclamation-circle

The submissions will be subjected to a similarity check. Any submissions that

fail the similarity check will not be graded and will be reported to the ethics

committee as a case of academic integrity violation, which may result in the

suspension of the involved students.

## Reviews

There are no reviews yet.