diff --git a/README.md b/README.md
index a2cffe6..a4ea3ac 100644
--- a/README.md
+++ b/README.md
@@ -1,8 +1,4 @@
----
-title: "Machine Learning Methods for Orbital Debris Characterization: Report 2"
-author: Anson Biggs
-date: 2022-02-14
----
+Most up-to-date version: https://projects.ansonbiggs.com/posts/2022-04-03-machine-learning-directed-study-report-2/
 
 ## Gathering Data
 
@@ -19,37 +15,39 @@ is an example of one of the satellites that was used.
 
 ## Data Preparation
 
 The models were processed in Blender, which quickly converted the
-assemblies to `stl` files, giving 108 unique parts to be processed. Since
-the expected final size of the dataset is expected to be in the
+assemblies to `stl` files, giving 108 unique parts to be processed.
+Since the final size of the dataset is expected to be in the
 magnitude of the thousands, an algorithm capable of getting the required
 properties of each part is the only feasible solution. From the analysis
-performed in [Report 1](https://gitlab.com/orbital-debris-research/directed-study/report-1/-/blob/main/README.md),
-we know that the essential part of the data is the moments of inertia
+performed in [Report
+1](https://gitlab.com/orbital-debris-research/directed-study/report-1/-/blob/main/README.md),
+we know that the essential debris properties are the moments of inertia,
 which helped narrow down potential algorithms. Unfortunately, this is
 one of the more complicated things to calculate from a mesh, but thanks
-to a paper from David Eberly in 2002 titled [Polyhedral Mass Properties](https://www.geometrictools.com/Documentation/PolyhedralMassProperties.pdf),
-I could replicate his algorithm in the Julia programming language. The
-current implementation of the algorithm calculates a moment of inertia
-tensor, volume, and center of gravity in a few milliseconds per part.
+to a paper from [@eberlyPolyhedralMassProperties2002] titled [Polyhedral
+Mass
+Properties](https://www.geometrictools.com/Documentation/PolyhedralMassProperties.pdf),
+his algorithm could be implemented in the Julia programming
+language. The current implementation of the algorithm calculates a
+moment of inertia tensor, volume, and center of gravity in a few
+milliseconds per part.
 
 ![Current Process](Figures/current_process.svg)
 
-The algorithm\'s speed is critical not only for the eventually large
+The algorithm's speed is critical not only for the eventually large
 number of debris pieces that have to be processed, but many of the
 data science algorithms we plan on performing on the compiled data need
 the data to be normalized. I have decided that it makes the most sense to
 normalize the dataset based on volume. I chose volume for a few reasons,
-namely because it was easy to come up with an efficient algorithm to
+namely because it was easy to implement an efficient algorithm to
 calculate volume, and currently, volume seems to be the least essential
-property for the data analysis. Scaling all the models to have the same
-volume can be done very efficiently using derivative-free numerical
-root-finding algorithms. The current implementation can scale and
-process all the properties using only 30% more time than getting the
-properties without first scaling. Finding the correct scale is an
-iterative process, so scaling may become significantly more expensive as
-more complex models become available.
+property for the data analysis. Unfortunately, scaling a model to have a
+specific volume is an iterative process, but it can be done very
+efficiently using derivative-free numerical root-finding algorithms. The
+current implementation can scale and process all the properties using
+only 30% more time than getting the properties without first scaling.
-```txt
+```{.txt}
 Row │ variable  mean         min          median       max
     │ Symbol    Float64      Float64      Float64      Float64
 ─────┼────────────────────────────────────────────────────────────
@@ -62,15 +60,16 @@ more complex models become available.
   7 │ Iz        0.0111086    1.05596e-17  2.1906e-8    1.15363
 ```
 
-Above is a summary of the current dataset without scaling. The max
-values are well above the median, and given the dataset\'s small size,
-there are still significant outliers in the dataset. For now, any
+Above is a summary of the current 108-part dataset without scaling. The
+max values are well above the median, and given the dataset's small
+size, there are still significant outliers. For now, any
 significant outliers will be removed, with more explanation below, but
 hopefully, this will not become as necessary or shrink the dataset as
 much as the dataset grows. As mentioned before, a raw and a normalized
-dataset were prepared, and the data can be found here:
+dataset were prepared, and the data can be found below:
 
-[dataset.csv](https://gitlab.com/orbital-debris-research/directed-study/report-2/-/blob/main/dataset.csv), [scaled_dataset.csv](https://gitlab.com/orbital-debris-research/directed-study/report-2/-/blob/main/scaled_dataset.csv)
+- [dataset.csv](https://gitlab.com/orbital-debris-research/directed-study/report-2/-/blob/main/dataset.csv)
+- [scaled_dataset.csv](https://gitlab.com/orbital-debris-research/directed-study/report-2/-/blob/main/scaled_dataset.csv)
 
 ## Characterization
 
@@ -82,7 +81,7 @@ different from the previous one, it is essential to ensure inertia
 is still the most important. We begin by using the `pca` function in
 Matlab on our scaled dataset.
 
-```matlab
+```{.matlab}
 [coeff,score,latent] = pca(scaled_data);
 ```
 
@@ -105,7 +104,7 @@ time will be spent analyzing the properties.
 Now that it has been determined that inertia will be used, k-means
 clustering can be performed on the raw, unscaled dataset.
-```matlab
+```{.matlab}
 [IDX, C] = kmeans(inertia,3);
 
 histcounts(IDX) % Get the size of each cluster
@@ -119,7 +118,7 @@ groups. Therefore, to get a better view, only the smallest magnitude
 group will be kept since it seems to have the most variation and k-means
 will be performed again to understand the data better.
 
-```matlab
+```{.matlab}
 inertia = inertia(IDX == 1,:);
 [IDX, C] = kmeans(inertia,3);
diff --git a/citations.bib b/citations.bib
new file mode 100644
index 0000000..56e90d0
--- /dev/null
+++ b/citations.bib
@@ -0,0 +1,11 @@
+
+@misc{eberlyPolyhedralMassProperties2002,
+  title = {Polyhedral {{Mass Properties}} ({{Revisited}})},
+  author = {Eberly, David},
+  year = {2002},
+  month = dec,
+  copyright = {CC BY 4.0},
+  url = {https://www.geometrictools.com/Documentation/PolyhedralMassProperties.pdf}
+}
+
+
diff --git a/report2AnsonBiggs.docx b/report2AnsonBiggs.docx
new file mode 100644
index 0000000..cf15b48
Binary files /dev/null and b/report2AnsonBiggs.docx differ
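Reviewer note on the scaling step described in the README hunks above: computing volume from a closed triangle mesh and then finding the uniform scale that hits a target volume with a derivative-free root finder can be sketched as below. This is a Python illustration, not the report's Julia implementation; `mesh_volume`, `scale_to_volume`, and the tetrahedron data are hypothetical, and bisection stands in for whichever derivative-free method the report actually uses.

```python
# Assumes a watertight triangle mesh given as ((x, y, z), (x, y, z), (x, y, z))
# tuples with outward-facing winding, as an stl loader would produce.

def mesh_volume(triangles):
    """Volume of a closed mesh via signed tetrahedra (divergence theorem)."""
    total = 0.0
    for a, b, c in triangles:
        # Signed volume of tetrahedron (origin, a, b, c) = a . (b x c) / 6
        bxc = (b[1] * c[2] - b[2] * c[1],
               b[2] * c[0] - b[0] * c[2],
               b[0] * c[1] - b[1] * c[0])
        total += (a[0] * bxc[0] + a[1] * bxc[1] + a[2] * bxc[2]) / 6.0
    return total

def scale_mesh(triangles, s):
    """Uniformly scale every vertex by s."""
    return [tuple(tuple(s * x for x in v) for v in tri) for tri in triangles]

def scale_to_volume(triangles, target, lo=1e-6, hi=1e6, tol=1e-9):
    """Bisection (derivative-free) on the scale factor. Volume grows
    monotonically with scale, so [lo, hi] is assumed to bracket the root."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if mesh_volume(scale_mesh(triangles, mid)) < target:
            lo = mid
        else:
            hi = mid
    return scale_mesh(triangles, 0.5 * (lo + hi))

# Unit right tetrahedron with outward-wound faces; exact volume is 1/6.
tet = [((0, 0, 0), (0, 1, 0), (1, 0, 0)),
       ((0, 0, 0), (1, 0, 0), (0, 0, 1)),
       ((0, 0, 0), (0, 0, 1), (0, 1, 0)),
       ((1, 0, 0), (0, 1, 0), (0, 0, 1))]
print(mesh_volume(tet))                        # ~ 1/6
print(mesh_volume(scale_to_volume(tet, 1.0)))  # ~ 1.0
```

Each bisection step re-evaluates the mesh volume, which is consistent with the report's observation that scaling adds roughly 30% to the per-part processing time.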
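The clustering step (`[IDX, C] = kmeans(inertia,3)` followed by `histcounts(IDX)` in the Matlab snippets above) can also be sketched language-agnostically. Below is a minimal pure-Python Lloyd's algorithm; the function names and the toy `rows` data are illustrative, not the report's code or dataset.

```python
import random
from collections import Counter

def dist2(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def kmeans(points, k, iters=100, seed=0):
    """Lloyd's algorithm: returns (labels, centroids)."""
    centroids = random.Random(seed).sample(points, k)
    for _ in range(iters):
        # Assign each point to its nearest centroid.
        labels = [min(range(k), key=lambda j: dist2(p, centroids[j]))
                  for p in points]
        # Recompute each centroid as the mean of its members.
        new = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            # Keep the old centroid if a cluster empties out.
            new.append(tuple(sum(xs) / len(members) for xs in zip(*members))
                       if members else centroids[j])
        if new == centroids:  # converged
            break
        centroids = new
    return labels, centroids

# Toy stand-in for inertia rows: two well-separated magnitude groups.
rows = [(0.0, 0.0, 0.0), (0.1, 0.2, 0.1), (1.0, 1.1, 0.9), (1.1, 0.9, 1.0)]
labels, centroids = kmeans(rows, 2)
print(Counter(labels))  # cluster sizes, like histcounts(IDX) in the report
```

As in the report, the labels can then be used to keep only one cluster (the Matlab `inertia(IDX == 1,:)` step) before clustering again.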