CD-HPF: New Habitability Score
Via Data Analytic Modeling
Snehanshu Saha
Professor, Computer Science Dept., PESIT-BSC,
Bangalore.
INTRODUCTION
At present, the only known habitable planet
is Earth.
Two important questions need to be answered:
a. Can life exactly like that on Earth
exist somewhere else? Or
b. Does life exist somewhere else
in an unknown form?
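Astronomers mainly use two parameters
to answer these two questions.
a. Earth Similarity Index (ESI), based on
four parameters (radius, surface
temperature, density and escape
velocity), given for a parameter x as
$$ESI_x = \left(1 - \left|\frac{x - x_0}{x + x_0}\right|\right)^{w}$$
where x_0 is the Earth reference value of
the parameter and w is its weight exponent.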
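As a minimal sketch of the single-parameter ESI, assuming the usual form ESI_x = (1 - |x - x0| / (x + x0))^w: the function name `esi_component` and the weight `w=0.5` are illustrative assumptions, not values from these slides.

```python
def esi_component(x: float, x0: float, w: float) -> float:
    """Similarity of one planetary parameter x to its Earth reference x0.

    Returns 1.0 when x equals x0 and decreases toward 0 as x departs from x0.
    """
    return (1.0 - abs((x - x0) / (x + x0))) ** w


# Radius expressed in Earth units, so the reference value x0 is 1.0.
# The weight exponent 0.5 is a hypothetical value chosen for illustration.
earth_like = esi_component(1.0, 1.0, w=0.5)
twice_radius = esi_component(2.0, 1.0, w=0.5)
print(earth_like, twice_radius)
```

A value of 1.0 means the parameter matches Earth exactly; larger weight exponents penalize deviations more strongly.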
b. Planetary Habitability Index (PHI), which
addresses the second question
and is given as
$$PHI = (S \cdot E \cdot C \cdot L)^{1/4}$$
where S is substrate, E is energy, C is
chemistry of compounds and L is liquid
solvent.
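The PHI is the geometric mean of its four component scores. A short sketch, assuming each component has already been normalized to a positive number (the function name `phi` and the sample scores are hypothetical):

```python
def phi(substrate: float, energy: float, chemistry: float, solvent: float) -> float:
    """Planetary Habitability Index as the fourth root of S * E * C * L."""
    return (substrate * energy * chemistry * solvent) ** 0.25


# Hypothetical component scores, used only to show the call.
print(phi(0.8, 0.6, 0.9, 0.5))
```

Because it is a geometric mean, a single component close to zero pulls the whole index close to zero.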
We propose an approach that
uses the Cobb-Douglas production
function to obtain a new habitability
score for exoplanets.
Cobb Douglas Production
Function (CDPF)
In general, CDPF is given as
$$Y = k \cdot x_1^{\alpha} \cdot x_2^{\beta}$$
where Y is the production function, α and
β are elasticity constants (positive
fractions), k is a positive constant,
and x_1 and x_2 are the input parameters.
NOTE: The function can be extended to any
number of inputs.
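The general form, extended to any number of inputs, can be sketched as follows. The function name `cobb_douglas` and its argument layout are assumptions made for illustration.

```python
import math
from typing import Sequence


def cobb_douglas(inputs: Sequence[float],
                 elasticities: Sequence[float],
                 k: float = 1.0) -> float:
    """Evaluate Y = k * x1^a1 * x2^a2 * ... * xn^an."""
    if len(inputs) != len(elasticities):
        raise ValueError("need exactly one elasticity per input")
    return k * math.prod(x ** a for x, a in zip(inputs, elasticities))


# Two inputs with elasticities 0.5 and 0.5 (hypothetical values).
print(cobb_douglas([4.0, 9.0], [0.5, 0.5]))
```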
Elasticity Constants in CDPF
The sum of α and β determines the
returns to scale of the function:
1. If the sum is 1, the function is
homogeneous of degree 1 and exhibits
constant returns to scale (CRS):
scaling all inputs by a factor scales the
output by the same factor.
2. If the sum is less than 1, the function
exhibits decreasing returns to scale (DRS):
output grows less than proportionally, so
diminishing returns set in.
3. If the sum is more than 1, the function
exhibits increasing returns to scale (IRS):
output grows more than proportionally
when all inputs are scaled.
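The three cases follow from the fact that scaling both inputs by a factor t multiplies the output by t^(α+β). A small numerical check, with illustrative helper names and arbitrary input values:

```python
def output(x1: float, x2: float, alpha: float, beta: float, k: float = 1.0) -> float:
    """Two-input Cobb-Douglas function Y = k * x1^alpha * x2^beta."""
    return k * x1 ** alpha * x2 ** beta


def returns_to_scale(alpha: float, beta: float, tol: float = 1e-12) -> str:
    """Label the regime from the sum of the elasticities."""
    total = alpha + beta
    if abs(total - 1.0) <= tol:
        return "CRS"
    return "DRS" if total < 1.0 else "IRS"


def scaling_ratio(alpha: float, beta: float, t: float = 2.0,
                  x1: float = 3.0, x2: float = 5.0) -> float:
    """Ratio Y(t*x1, t*x2) / Y(x1, x2); equals t ** (alpha + beta)."""
    return output(t * x1, t * x2, alpha, beta) / output(x1, x2, alpha, beta)


for a, b in [(0.6, 0.4), (0.8, 0.1), (0.9, 0.3)]:
    print(returns_to_scale(a, b), scaling_ratio(a, b))
```

Doubling the inputs doubles the output only in the CRS case; the ratio is below 2 for DRS and above 2 for IRS.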
Use of CDPF to estimate CDHS
We formulated a CDPF to estimate the
Cobb-Douglas Habitability Score (CDHS)
for exoplanets.
First, we calculated the interior and
surface CDHS for each exoplanet, then
used a convex combination of the
two to compute the final CDHS.
In doing so, we found and proved
that the function attains its maximum in
the CRS and DRS cases.
We specify the CDPF as
$$Y = f(R, D, T_s, V_e) = R^{\alpha} \cdot D^{\beta} \cdot T_s^{\gamma} \cdot V_e^{\delta}$$
where R, D, T_s and V_e are radius, density,
surface temperature and escape velocity, and
α, β, γ, δ are their elasticity constants.
The criterion is to choose α, β, γ, δ so as to
maximize Y.
For CRS, where all the elasticities of the
different cost components are equal, Y can
be specified as
$$Y = \prod_{i=1}^{n} x_i^{\alpha_i} \quad \text{and} \quad \sum_{i=1}^{n} \alpha_i = 1$$
In such cases each α_i = 1/n, and Y is the
geometric mean of all inputs.
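The geometric-mean claim can be checked numerically: with n inputs and every elasticity equal to 1/n, the product form and the geometric mean agree. The function names and sample inputs below are illustrative assumptions.

```python
import math
from typing import Sequence


def cdpf_equal_elasticities(inputs: Sequence[float]) -> float:
    """Product of x_i ** (1/n): the CRS case with all elasticities equal."""
    n = len(inputs)
    return math.prod(x ** (1.0 / n) for x in inputs)


def geometric_mean(inputs: Sequence[float]) -> float:
    """Geometric mean computed independently through logarithms."""
    return math.exp(sum(math.log(x) for x in inputs) / len(inputs))


# Hypothetical planetary parameters in Earth units.
values = [1.2, 0.9, 1.1, 1.05]
print(cdpf_equal_elasticities(values), geometric_mean(values))
```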
Data set
For this work, we used the HABCAT
database, available at
http://phl.upr.edu/projects/habitableexoplanets-catalog
From this database we selected 664
confirmed exoplanets for which the surface
temperature was known.
Our Results
CDPF was applied to obtain one CDHS based on
radius and density, called CDHSi (interior),
and another based on escape
velocity and surface temperature,
called CDHSs (surface). The values ranged
from 0.8607 to 168.35 for CDHSi
and from 1.01521 to 19.9395 for CDHSs.
Graphs for CDHSi & CDHSs
[Figure: plot of CDHSi. Here α and β are 0.8 and 0.1, respectively.]
[Figure: plot of CDHSs. Here α and β are 0.8 and 0.1, respectively.]
CDHS Calculation
Next, we calculated the final CDHS as
$$CDHS = w' \cdot CDHS_i + w'' \cdot CDHS_s$$
where w' was taken as 0.99 and w''
as 0.01; the weights of a convex combination
must satisfy w' + w'' = 1.
The values obtained for CDHS ranged
from 0.87225 to 166.87.
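A minimal sketch of the full calculation, under stated assumptions: inputs are expressed in Earth units, the exponents 0.8 and 0.1 quoted with the graphs are applied in the order shown (radius then density, escape velocity then surface temperature), and the weights are 0.99 and 0.01. The pairing of exponents to parameters and all function names are assumptions, not confirmed by the slides.

```python
def cdhs_interior(radius: float, density: float,
                  alpha: float = 0.8, beta: float = 0.1) -> float:
    """Interior score CDHSi = R^alpha * D^beta (exponent order assumed)."""
    return radius ** alpha * density ** beta


def cdhs_surface(escape_velocity: float, surface_temp: float,
                 gamma: float = 0.8, delta: float = 0.1) -> float:
    """Surface score CDHSs = Ve^gamma * Ts^delta (exponent order assumed)."""
    return escape_velocity ** gamma * surface_temp ** delta


def cdhs(radius: float, density: float, escape_velocity: float,
         surface_temp: float, w_interior: float = 0.99,
         w_surface: float = 0.01) -> float:
    """Convex combination of the interior and surface scores."""
    if abs(w_interior + w_surface - 1.0) > 1e-9:
        raise ValueError("weights of a convex combination must sum to 1")
    return (w_interior * cdhs_interior(radius, density)
            + w_surface * cdhs_surface(escape_velocity, surface_temp))


# With every parameter equal to 1 (Earth units), both partial scores are 1.
print(cdhs(1.0, 1.0, 1.0, 1.0))
```

With a weight of 0.99 on the interior score, the final CDHS stays very close to CDHSi, which is consistent with the similar ranges reported for the two.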
Classification based on CDHS
After calculating CDHS, we applied the
KNN classification algorithm to find
the number of planets belonging to
Earth's class.
Based on the CDHS results, 5 classes
were considered with k = 7, where Earth's
class was Class 4.
The classes have the following ranges
according to the classification.
KNN Algorithm
Let k be the desired number of nearest neighbors and S := {p1, ..., pn}
be the set of training samples of the form pi = (xi, ci), where xi is the d-dimensional feature vector (d = 1 in our case) of the point pi and ci is the
class that pi belongs to. Let T := {p1', p2', ..., pr'} be the testing samples.
for each p' = (x', c') in T
{
compute the distance d(x', xi) between p' and every pi in S;
sort the pi by d(x', xi);
select the k closest training samples to p' from the sorted list;
assign a class to p' by majority vote among those k samples;
}
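The pseudocode can be sketched in Python for the one-dimensional case, where the only feature is the CDHS value. The function name `knn_classify` and the toy training pairs are hypothetical; ties in the vote go to whichever label was counted first, which is one possible convention and not something the slides specify.

```python
from collections import Counter
from typing import List, Tuple


def knn_classify(train: List[Tuple[float, int]], x_new: float, k: int = 7) -> int:
    """Classify x_new by majority vote among its k nearest training samples.

    train holds (feature, class) pairs with a scalar feature.
    """
    if k < 1 or k > len(train):
        raise ValueError("k must be between 1 and the number of training samples")
    # Sort training samples by absolute distance to the query point.
    neighbours = sorted(train, key=lambda p: abs(p[0] - x_new))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]


# Hypothetical (CDHS, class) pairs, only to show the call.
toy = [(0.9, 5), (1.0, 5), (1.1, 5), (1.3, 4), (1.35, 4), (1.5, 3), (2.5, 1)]
print(knn_classify(toy, 1.05, k=3))
```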
Class Ranges According to CDHS
Class 1: 1.98 ≤ CDHS ≤ 166.87
Class 2: 1.68 < CDHS < 1.98
Class 3: 1.443 < CDHS ≤ 1.68
Class 4: 1.23 < CDHS ≤ 1.443
Class 5: 0.87225 ≤ CDHS ≤ 1.23
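The class boundaries listed above can be turned into a simple lookup. This is only a sketch of the reported ranges, not the KNN classifier itself; the function name `cdhs_class` is an assumption.

```python
def cdhs_class(score: float) -> int:
    """Map a CDHS value to its class number using the reported ranges."""
    if not 0.87225 <= score <= 166.87:
        raise ValueError("score lies outside the reported CDHS range")
    if score >= 1.98:
        return 1
    if score > 1.68:
        return 2
    if score > 1.443:
        return 3
    if score > 1.23:
        return 4
    return 5


for s in (1.0, 1.3, 1.5, 1.7, 2.5):
    print(s, cdhs_class(s))
```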
Result of Classification
80% of the data was used for training and 20% for testing.
In Class 5, we obtained 13 exoplanets.
The accuracy obtained was 92.5%.
[Figure: result plot of KNN]
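An 80/20 split and an accuracy figure of this kind can be computed as sketched below. The shuffling seed, the helper names, and the use of a random split are assumptions; the slides do not say how the split was drawn.

```python
import random
from typing import List, Sequence, Tuple


def train_test_split(samples: Sequence, test_fraction: float = 0.2,
                     seed: int = 0) -> Tuple[List, List]:
    """Shuffle the samples and return (training part, testing part)."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def accuracy(true_labels: Sequence[int], predicted_labels: Sequence[int]) -> float:
    """Fraction of test samples whose predicted class matches the true class."""
    if len(true_labels) == 0 or len(true_labels) != len(predicted_labels):
        raise ValueError("label sequences must be non-empty and of equal length")
    correct = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return correct / len(true_labels)


train, test = train_test_split(range(100))
print(len(train), len(test))
```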
High Level Dataflow
Load the dataset
-> Cobb Douglas Engine
-> Computed CD-HPF
-> Employ KNN (preliminary classification)
-> No. of classes
-> Attribute Filtering
-> Reorganized Classes
-> Final 5 classes
Class 5 contains Earth &
other habitable planets
Planets highly likely to be habitable are
GJ163c
GJ667C c
GJ832c
HD40307g
Kepler-186f
Kepler-62e
Kepler-62f etc.
Acknowledgement
I sincerely thank Ms. Kakoli Bora and Ms.
Surbhi Agrawal, my Ph.D. students, for
their sincere efforts. I acknowledge Dr.
Margarita Safonova, Indian Institute of
Astrophysics, who has been an excellent
collaborator in this project.
THANK YOU