We use Emergent Laws about the stability and predictive performance of sets of “similar” laws, and about features of other sets of similar objects, to control the learning process. These Meta Laws replace the stochastic assumptions of conventional statistics in emergent-law-based statistics.
We searched for Emergent Laws for many different prediction tasks in different databases. In each search, only data up to time t was used, and if a law (with an emergence set of size T) was found at time t, we predicted that the pattern observed so far without exception would appear again at t+T.
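The search-and-predict loop described above can be sketched as follows. This is a minimal toy illustration, not the actual implementation: here a “law” is simply a pattern that has repeated with a fixed period T throughout the history seen so far, and `find_law` / `predict_next` are hypothetical names.

```python
def find_law(history):
    """Toy 'law' search: look for a pattern that held without exception.

    A law here is a window of length T that recurs with period T
    through every full earlier period of the history seen up to time t.
    Returns (window, T) or None if no such pattern emerged.
    """
    for T in range(1, len(history) // 2 + 1):
        window = history[-T:]
        # the pattern must have been observed in every full earlier period
        if all(history[i:i + T] == window
               for i in range(len(history) - T, -1, -T)):
            return window, T
    return None

def predict_next(history):
    """Predict that the always-observed pattern appears again at t+T."""
    law = find_law(history)
    if law is None:
        return None          # no emergent law found: make no prediction
    window, T = law
    return window[0]         # the pattern's next element

# only data up to time t is used for the search
print(predict_next([1, 2, 1, 2, 1, 2]))  # → 1: the pattern (1, 2) repeats
```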
Afterwards, we counted how often the predictions of laws confirmed DiV times (up to time t) came true. In this way, the rate of true predictions – also called the Reliability – was determined empirically:
Rel(DiV) = Number of true predictions / Total number of predictions
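The formula above can be computed directly from a prediction log. The following is a minimal sketch; the log format (pairs of confirmation count DiV and outcome) is an assumption for illustration only.

```python
from collections import defaultdict

def reliability_by_div(prediction_log):
    """Rel(DiV) = number of true predictions / total number of predictions,
    grouped by how often the underlying law had been confirmed (DiV)."""
    counts = defaultdict(lambda: [0, 0])   # DiV -> [true, total]
    for div, was_true in prediction_log:
        counts[div][0] += was_true         # True counts as 1
        counts[div][1] += 1
    return {div: true / total for div, (true, total) in counts.items()}

# hypothetical log: (DiV of the law, whether its prediction came true)
log = [(1, True), (1, False), (4096, True), (4096, True)]
print(reliability_by_div(log))   # → {1: 0.5, 4096: 1.0}
```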
The following diagram shows the observed rate of true predictions for a range of different prediction tasks.
- bank: Does a customer invest in a term deposit at a bank?
- bike: number of rented bikes in the next hour
- temp: temperature in the next hour in Washington DC
- hum: humidity in the next hour in Washington DC
- credit: Does a customer repay a loan as contracted?
- SuP_Vola: volatility of the S&P 500 on the next day
- soccer_tordiff: goal difference between home team and away team
- soccer_punkte: points of the home team
- lendingclub_rendite: return of a loan
- lendingclub_pd: Does a Lending Club customer repay the loan as contracted?
- lendingclub_lgd: If a customer does not pay as contracted, how much will the creditor lose?
As the table shows, there are patterns that were observed more than 32,768 times. Furthermore, of the 25,971 predictions made by these laws, only one was wrong. It seems possible to find a DiV for which all predictions have been true so far. In our opinion, this is the closest possible approximation to metaphysical truth.
Moreover, it can be seen that Rel(DiV) – the rate of true predictions – shows the same structure in all cases:
The higher the DiV, the higher the Rel.
It should be noted that a wrong prediction indicates that a pattern never observed before has occurred for the first time. The number (1 − Rel) is therefore the relative frequency of first-time occurrences of new patterns.
In the above examples, it can be assumed that no opponent exists who reacts to predictable patterns and disturbs them. For problems such as predicting the returns of share indices or the payouts of soccer bets, it should be assumed that one plays against an opponent. In these cases too, the Reliability had fundamentally the same properties, but the predictive quality was always structurally lower.
There are several criteria (like the above) that are useful for differentiating between prediction problems. In addition, different “selection mechanisms” for laws can be used. Empirical results about the rate of true predictions of different selection mechanisms and heuristics across different problems form the basis of the law-selection process of our learning systems.
In this way, the empirical results from different applications are accessible to our learning system and guide the learning process; this replaces the usual assumptions about stochastic distributions.
The empirical fundament of our approach to machine learning consists of laws about the predictive performance of laws.
Perhaps the most important law is the following one:
There was always a number of predictions T for which the Reliability of higher-confirmed laws was greater than the Reliability of lower-confirmed laws.
The adjacent table shows, for example, that in every sequence of 10 predictions, laws confirmed 4,096 times always had a higher rate of true predictions than laws confirmed 1–2 times during the same period.
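The window-by-window comparison described above can be sketched as a simple check. This is a hedged illustration under assumed data: two lists of prediction outcomes (for high- and low-confirmed laws) aligned over the same chronological period, and `dominates` is a hypothetical helper name.

```python
def dominates(high_outcomes, low_outcomes, T=10):
    """Return True if, in every aligned window of T predictions, the
    high-confirmed laws had a strictly higher rate of true predictions."""
    windows = min(len(high_outcomes), len(low_outcomes)) // T
    for w in range(windows):
        hi = sum(high_outcomes[w * T:(w + 1) * T]) / T
        lo = sum(low_outcomes[w * T:(w + 1) * T]) / T
        if hi <= lo:
            return False
    return True

high = [True] * 20           # e.g. outcomes of laws confirmed 4,096 times
low = [True, False] * 10     # e.g. outcomes of laws confirmed 1-2 times
print(dominates(high, low))  # → True: dominance holds in this toy data
```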
We call this property the “universal dominance of prediction laws”, or UDPL for short.
This is the basis for the universal dominance of a very important prediction heuristic.
“Regardless of the specific prediction problem, it was so far always better (from an emergent perspective) to use higher-confirmed laws for predictions; therefore, in any new prediction problem, also choose the higher-confirmed laws.”
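The heuristic quoted above amounts to a simple selection rule: among all laws that currently apply, prefer the one confirmed most often. The following sketch makes this concrete; the `Law` record and `choose_law` are hypothetical structures for illustration, not the actual system.

```python
from dataclasses import dataclass

@dataclass
class Law:
    pattern: str          # what the law predicts
    confirmations: int    # DiV: how often the law has held so far

def choose_law(applicable_laws):
    """UDPL heuristic: among applicable laws, pick the higher-confirmed one."""
    return max(applicable_laws, key=lambda law: law.confirmations)

laws = [Law("A", 2), Law("B", 4096), Law("C", 128)]
print(choose_law(laws).pattern)   # → "B": the most-confirmed law wins
```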