AstroMAE is a novel approach for redshift prediction, designed to address the limitations of traditional machine learning methods that rely heavily on labeled data and feature extraction. Redshift is a key concept in astronomy, referring to the stretching of light from distant galaxies as they move away from us due to the expansion of the universe. By measuring redshift, astronomers can determine the distance and velocity of celestial objects, providing valuable insights into the structure and evolution of the cosmos.
Utilizing a masked autoencoder, AstroMAE pretrains a vision transformer encoder on Sloan Digital Sky Survey (SDSS) images to capture general patterns without the need for labels. This pretrained encoder is then fine-tuned within a specialized architecture for redshift prediction, combining both global and local feature extraction. AstroMAE represents the first application of a masked autoencoder for astronomical data and outperforms other vision transformer and CNN-based models in accuracy, showcasing its potential in advancing our understanding of the cosmos.
This study utilizes data from the Sloan Digital Sky Survey (SDSS), one of the most comprehensive astronomical surveys to date. SDSS is a major multi-spectral imaging and spectroscopic redshift survey, providing detailed data about millions of celestial objects. The dataset used in this experiment is derived from previous work on the AstroMAE project. Specifically, it includes 1,253 images, each with corresponding magnitude values for the five photometric bands (u, g, r, i, z) and redshift targets. Each image has a resolution of 64 × 64 pixels and is center-cropped to 32 × 32 pixels before being fed to the model.
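The center-crop step described above can be sketched as follows. This is a minimal illustration, not the project's actual preprocessing code; the function name and the band-last array layout are assumptions.

```python
import numpy as np

def center_crop(img: np.ndarray, size: int = 32) -> np.ndarray:
    """Crop a size x size region from the center of an image.

    Assumes the spatial dimensions are the first two axes (H, W, ...).
    """
    h, w = img.shape[:2]
    top = (h - size) // 2
    left = (w - size) // 2
    return img[top:top + size, left:left + size]

# Example: a 64 x 64 image with five photometric bands (u, g, r, i, z)
image = np.zeros((64, 64, 5))
cropped = center_crop(image)
print(cropped.shape)  # -> (32, 32, 5)
```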
AstroMAE is evaluated using multiple metrics to assess its performance comprehensively. These metrics include Mean Absolute Error (MAE), Mean Square Error (MSE), Bias, Precision, and R² score, offering a complete view of the model's prediction accuracy and reliability, particularly for redshift prediction tasks.
The scatter plot visualizes the relationship between predicted redshift values (y-axis) and spectroscopic redshift values (x-axis). Each point represents a data sample, with the color indicating point density—warmer colors (yellow to red) denote regions of higher density. The red dashed line represents the ideal scenario (y = x), where predicted redshifts perfectly match spectroscopic values. Points closer to this line indicate better model predictions.
- Mean Absolute Error (MAE): Measures the average magnitude of the errors between predicted and true values, providing insight into prediction accuracy.
- Mean Square Error (MSE): Quantifies the average of squared errors, emphasizing larger deviations to highlight significant prediction errors.
- Bias: Measures the average residuals between predicted and true values, indicating any systematic over- or underestimation in predictions.
- Precision: Represents the expected scatter of errors, reflecting the consistency of the model's predictions.
- R² Score: Evaluates how well the model predicts compared to the mean of true values; a value closer to 1 indicates better predictive performance.
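The metrics above can be computed from the residuals in a few lines of NumPy. This is a hedged sketch: "Precision" is implemented here as the normalized median absolute deviation (NMAD) of the residuals, one common definition of prediction scatter in photometric-redshift work; the project's exact definition may differ.

```python
import numpy as np

def redshift_metrics(z_true: np.ndarray, z_pred: np.ndarray) -> dict:
    """Compute MAE, MSE, Bias, Precision (NMAD), and R^2 for redshift predictions."""
    residuals = z_pred - z_true
    mae = np.mean(np.abs(residuals))
    mse = np.mean(residuals ** 2)
    bias = np.mean(residuals)  # systematic over-/underestimation
    # NMAD: robust scatter estimate of the residuals (assumed definition)
    precision = 1.4826 * np.median(np.abs(residuals - np.median(residuals)))
    ss_res = np.sum(residuals ** 2)
    ss_tot = np.sum((z_true - np.mean(z_true)) ** 2)
    r2 = 1.0 - ss_res / ss_tot
    return {"MAE": mae, "MSE": mse, "Bias": bias,
            "Precision": precision, "R2": r2}

# Example usage with made-up values
print(redshift_metrics(np.array([0.1, 0.2, 0.3]), np.array([0.12, 0.18, 0.31])))
```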
- total cpu time (second): Total time spent on CPU processing during execution, in seconds.
- total gpu time (second): Total time spent on GPU processing during execution, in seconds.
- execution time per batch (second): Average time taken to process each batch, in seconds.
- cpu memory (MB): CPU memory usage during execution, in megabytes.
- gpu memory (MB): GPU memory usage during execution, in megabytes.
- throughput (bps): Data processing rate in bits per second across all batches.
- batch size: Number of samples in each batch.
- number of batches: The total number of batches processed in the execution.
- device: The hardware device (CPU or CUDA) used for execution.
These metrics provide a detailed overview of AstroMAE's performance, emphasizing its effectiveness in redshift prediction tasks.
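The timing and throughput figures above can be gathered with a simple wall-clock loop around the batch processing. This is only a sketch of the idea, not the project's actual instrumentation; the function and parameter names are hypothetical, and real memory and GPU timings require additional tooling (e.g. CUDA events).

```python
import time

def profile_batches(run_batch, batches, bits_per_batch):
    """Time a batch loop and report a few of the metrics listed above.

    run_batch: any callable that processes one batch (hypothetical).
    bits_per_batch: data size of one batch in bits (hypothetical).
    """
    start = time.perf_counter()
    for batch in batches:
        run_batch(batch)
    total = time.perf_counter() - start
    n = len(batches)
    return {
        "total time (second)": total,
        "execution time per batch (second)": total / n,
        "throughput (bps)": n * bits_per_batch / total,
        "number of batches": n,
    }

# Example with a trivial workload
stats = profile_batches(sum, [[1, 2, 3]] * 10, bits_per_batch=3 * 32)
print(stats["number of batches"])  # -> 10
```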
This guide provides step-by-step instructions for running inference on the AI for Astronomy project. This process is intended for both Windows and Mac users, though the screenshots and terminal commands shown here are from a Windows computer.
- Python installed (3.x version recommended).
- Basic understanding of terminal usage.
- Ensure Git is installed to clone the repository or download the code.
You have two options to get the code:
- Copy the GitHub repository URL: UVA-MLSys/AI-for-Astronomy.
- Open your terminal and navigate to the directory where you want to save the project.
- Run the following command:
git clone https://github.com/UVA-MLSys/AI-for-Astronomy.git
- Follow the prompts to enter your GitHub username, password, or authentication token if required.
- From the GitHub page, click on "Download ZIP."
- Extract the ZIP file by right-clicking on it in your file explorer and selecting "Extract All." Ensure that all files and their structure are maintained.
- Save the extracted or cloned folder to the desired directory from which you will run the Python script.
- If you are using Rivanna or any other computing platform, ensure the folder structure remains intact and accessible by the Python environment or IDE you plan to use.
- Navigate to the following directory in your local project folder:
...\AI-for-Astronomy-main\AI-for-Astronomy-main\code\Anomaly Detection\Inference
- Locate the inference.py file in this directory.
- Open your terminal and navigate to the directory containing inference.py:
cd ...\AI-for-Astronomy-main\AI-for-Astronomy-main\code\Anomaly Detection\Inference
- Run the inference script using the following command:
python inference.py
- The script may take about one minute to complete.
- If prompted for missing libraries, install them using pip. Ensure that the timm library version is 0.4.12.
- Once the script completes, navigate to the following directory:
...\AI-for-Astronomy-main\AI-for-Astronomy-main\code\Anomaly Detection\Plots
- Open the following files to view the results:
  - inference.png: A visual representation of the inference results.
  - Results.json: The detailed numerical results of the inference.
Setting the Device
- To run the script on either GPU or CPU, set the --device argument accordingly:
  - For GPU: use 'cuda'
  - For CPU: use 'cpu'
- The default is set to run on CPU. To change the device, modify the --device argument as follows:
parser.add_argument('--device', type=str, default='cpu', help="Device to run the model on: 'cuda' for GPU or 'cpu' for CPU")
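Passing the argument on the command line works as sketched below. This standalone snippet only illustrates how argparse handles the --device flag; the explicit argument list stands in for what the shell would pass to inference.py.

```python
import argparse

# Mirror of the parser line shown above
parser = argparse.ArgumentParser()
parser.add_argument('--device', type=str, default='cpu',
                    help="Device to run the model on: 'cuda' for GPU or 'cpu' for CPU")

# The list mimics the CLI; the equivalent shell call would be:
#   python inference.py --device cuda
args = parser.parse_args(['--device', 'cuda'])
print(args.device)  # -> cuda

# With no arguments, the default applies
defaults = parser.parse_args([])
print(defaults.device)  # -> cpu
```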
- If you encounter issues with missing libraries, ensure you have installed all required packages using pip install. The version of timm must be 0.4.12 to avoid compatibility issues.

Follow these steps to create a Python virtual environment and install the necessary packages (numpy, torch, matplotlib, scipy, sklearn, timm). To keep dependencies organized and avoid conflicts, it is recommended to create a virtual environment.
python -m venv myenv
This command creates a virtual environment named myenv. To activate the environment, run:
- On Windows:
myenv\Scripts\activate
- On macOS/Linux:
source myenv/bin/activate
Now install all the required packages:
pip install torch==2.2.1 timm==0.4.12
pip install matplotlib scikit-learn scipy
pip install numpy==1.23.5
When you are finished working in the environment, deactivate it with:
deactivate
- Ensure that all directory paths are properly set according to your system's file structure.
- These instructions have been tested on both Windows and Mac systems, with only minor variations.
Anomaly_Detection
├── Fine_Tune_Model
├── Inference
├── Plots
├── blocks
├── Astronomy_Overview.pptx
├── NormalCell.py
├── Plot_Redshift.py
├── README.pdf
| Folder/File | Description |
|---|---|
| Fine_Tune_Model | Contains model weights. |
| Inference | Code and data required for running inference. |
| Plots | Generated visualizations, such as plots of model evaluation metrics and analysis results. |
| blocks | Source code for fine-tuning. |
| Astronomy_Overview.pptx | PowerPoint presentation summarizing the astronomy aspects of the project. |
| NormalCell.py | Python implementation of standard and customized multi-head self-attention mechanisms. |
| Plot_Redshift.py | Script for generating visualizations and evaluations related to redshift analysis. |
| README.pdf | Detailed guide providing step-by-step instructions for running inference. |
Don't hesitate to get in touch with us:
- Amirreza Dolatpour Fathkouhi: [email protected]
- Kaleigh O'Hara: [email protected]