Offline anatolsen  
#1 Posted : 08 June 2021 12:15:55(UTC)
anatolsen


Rank: Member

Groups: Registered
Joined: 08/06/2021(UTC)
Posts: 16
Location: Караганда

Colleagues,
I have a problem using the program, and I'm not very good at math.
I need to approximate my data with the sum of three normal distributions.
I have defined an approximating function with 9 parameters, but the iteration over the parameter values that should find the minimum sum of squares does not run. The program reports that my expression is set up incorrectly.
The expression in question is Sq(h1, h2, h3, µ1, µ2, µ3, σ1, σ2, σ3).
Could you tell me what I did wrong?
I attach the solution file: 01.sm (32kb) downloaded 17 time(s).
P.S. I'm sorry, I don't speak English very well.


Offline alyles  
#2 Posted : 08 June 2021 15:23:27(UTC)
alyles


Rank: Advanced Member

Groups: Registered
Joined: 23/03/2016(UTC)
Posts: 275
United States

Was thanked: 91 time(s) in 57 post(s)
Made some changes to your file

1.) In the function definition of Sq() I applied the vectorize function to Raz()^2. This performs element-wise operations on x, y rather than matrix operations.
2.) In your programming loop, I assigned Sq:=Sq() and then changed the conditional to Sq < Sqmin. This was because later you have Sqmin:=Sq, but you never defined the value Sq.
01.sm (32kb) downloaded 13 time(s).
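A minimal Python sketch (not SMath) of the two changes above. The shape of the model, h·exp(-(x-µ)²/(2σ²)) summed over three peaks, is an assumption, since the worksheet itself is not shown in the thread; the names gauss_sum, sq and keep_best are made up for illustration.

```python
import math

def gauss_sum(x, params):
    """Sum of three Gaussian peaks (assumed model shape).

    params = (h1, h2, h3, mu1, mu2, mu3, sigma1, sigma2, sigma3)
    """
    h, mu, sigma = params[0:3], params[3:6], params[6:9]
    return sum(
        h[k] * math.exp(-((x - mu[k]) ** 2) / (2.0 * sigma[k] ** 2))
        for k in range(3)
    )

def sq(xs, ys, params):
    """Sum of squares: each residual is squared on its own (element-wise)."""
    return sum((y - gauss_sum(x, params)) ** 2 for x, y in zip(xs, ys))

def keep_best(candidates, xs, ys):
    """Evaluate Sq once per candidate, store it, then compare with Sqmin."""
    best_params, sq_min = None, math.inf
    for params in candidates:
        current = sq(xs, ys, params)  # the value is defined before it is used
        if current < sq_min:
            sq_min, best_params = current, params
    return best_params, sq_min
```

The point of keep_best is the same as change 2: the current value is computed and stored first, so both the comparison and the later assignment to the minimum refer to something that exists.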
Feel free to join the SMath Studio Users Discord Channel: https://discord.gg/PayZpJW
Offline anatolsen  
#3 Posted : 09 June 2021 13:46:18(UTC)
anatolsen


Rank: Member

Groups: Registered
Joined: 08/06/2021(UTC)
Posts: 16
Location: Караганда

Dear Jean Giraud,
What is this?
Frechet Forum.sm?
You weren't answering my question; you were answering someone else's.
Offline overlord  
#4 Posted : 09 June 2021 13:57:55(UTC)
overlord


Rank: Advanced Member

Groups: Registered
Joined: 23/07/2013(UTC)
Posts: 1,124
Turkey

Was thanked: 506 time(s) in 337 post(s)
Originally Posted by: anatolsen
Colleagues,
I have a problem using the program, and I'm not very good at math.
I need to approximate my data with the sum of three normal distributions.
I have defined an approximating function with 9 parameters, but the iteration over the parameter values that should find the minimum sum of squares does not run. The program reports that my expression is set up incorrectly.
The expression in question is Sq(h1, h2, h3, µ1, µ2, µ3, σ1, σ2, σ3).
Could you tell me what I did wrong?
I attach the solution file: 01.sm (32kb) downloaded 17 time(s).
P.S. I'm sorry, I don't speak English very well.

I guess you are trying to multiply each matrix element by itself by squaring the matrix.
That doesn't work as a matrix operation; you have to vectorize it.
2021-06-09_13-55.png
Here is your revised file.

Regards

01_rev.sm (31kb) downloaded 8 time(s).

2021-06-09_13-55_1.png
Offline anatolsen  
#5 Posted : 09 June 2021 14:38:18(UTC)
anatolsen


Rank: Member

Groups: Registered
Joined: 08/06/2021(UTC)
Posts: 16
Location: Караганда

Dear alyles,
Thank you for correcting my mistake.
But I still did not get the desired result.
For some reason, the search stopped at hh1 = 0.01, and likewise at hh2 = 0.01 and hh3 = 0.01. That is, h1, h2 and h3 were only checked at the value 1.
I expected the iteration to run up to hh1 = hh2 = hh3 = 1, that is, h1 = h2 = h3 = 100.
For the other parameters it also seems that the search did not reach the end.
That is why the resulting graph shows a complete mismatch between the original data and the approximating formula.
Is there an error in my if statement?
I put continue in the else branch.
eng-min.sm (33kb) downloaded 13 time(s).

Offline anatolsen  
#6 Posted : 09 June 2021 14:45:19(UTC)
anatolsen


Rank: Member

Groups: Registered
Joined: 08/06/2021(UTC)
Posts: 16
Location: Караганда

Thank you, this error has already been pointed out to me.
I was trying to sum the squares of the errors.
See the eng-min file in my last reply.

Offline overlord  
#7 Posted : 09 June 2021 16:20:52(UTC)
overlord


Rank: Advanced Member

Groups: Registered
Joined: 23/07/2013(UTC)
Posts: 1,124
Turkey

Was thanked: 506 time(s) in 337 post(s)
Originally Posted by: anatolsen
Thank you, this error has already been pointed out to me.
I was trying to sum the squares of the errors.
See the eng-min file in my last reply.

Your iterations are wrong, unless this behaviour is intended. Let me explain.
When sigma.3 reaches 150, sigma.2 becomes 2. But sigma.3 doesn't start again from 1 after that.
It has to be reset to 1, and your algorithm doesn't do that.
Using a for loop makes much more sense here.
But beware: when all the ranges are multiplied together, we reach 1.35×10¹⁹.
This means your algorithm is trying to evaluate the function 1.35×10¹⁹ times.
Even 1×10⁶ evaluations take 30 seconds on my PC, so that algorithm will take a very long time.
I don't think we can see the result of your problem in a human lifetime.
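The reset behaviour and the run-time estimate above can be sketched in Python (not SMath). The tiny ranges are made up for illustration, and the time estimate assumes the cost scales linearly from the quoted 30 seconds per 10⁶ evaluations.

```python
import itertools
import math

def grid(ranges):
    """Yield every combination of the ranges.

    itertools.product behaves like nested for loops: each time an outer
    value advances, the inner ranges start again from their first value.
    """
    return itertools.product(*ranges)

def count_combinations(ranges):
    """Number of evaluations a full grid search needs."""
    return math.prod(len(r) for r in ranges)

def estimated_seconds(n_evaluations, seconds_per_million=30.0):
    """Linear extrapolation from a measured time per 10^6 evaluations."""
    return n_evaluations / 1e6 * seconds_per_million

# Made-up small example: outer value in 1..2, inner value in 1..3.
small = [range(1, 3), range(1, 4)]
combos = list(grid(small))  # the inner value restarts at 1 when the outer one changes

# Extrapolating the figures quoted in the post to years.
years = estimated_seconds(1.35e19) / (3600 * 24 * 365)
```

Under that linear assumption, 1.35×10¹⁹ evaluations come out at roughly 13 million years, which is why the full 9-parameter grid is not practical.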
You can find the worksheet below.

Regards.

eng-min_rev.sm (31kb) downloaded 11 time(s).


Offline mkraska  
#8 Posted : 09 June 2021 16:47:43(UTC)
mkraska


Rank: Advanced Member

Groups: Registered
Joined: 15/04/2012(UTC)
Posts: 1,980
Germany

Was thanked: 1120 time(s) in 718 post(s)
fit.png
fit.sm (137kb) downloaded 18 time(s).
Martin Kraska

Pre-configured portable distribution of SMath Studio: https://smath.com/wiki/SMath_with_Plugins.ashx
Offline Razonar  
#9 Posted : 09 June 2021 17:10:39(UTC)
Razonar


Rank: Advanced Member

Groups: Registered
Joined: 28/08/2014(UTC)
Posts: 1,356
Uruguay

Was thanked: 815 time(s) in 516 post(s)
Hi. Some notes attached.

eng-min.sm (45kb) downloaded 19 time(s).
eng-min.pdf (383kb) downloaded 18 time(s).

Best regards.
Alvaro.

Edited by user 09 June 2021 17:12:48(UTC)  | Reason: Oh, I don't see the Martin post, it looks better than this one.

Offline anatolsen  
#10 Posted : 10 June 2021 09:20:29(UTC)
anatolsen


Rank: Member

Groups: Registered
Joined: 08/06/2021(UTC)
Posts: 16
Location: Караганда

Dear overlord,

Thanks.
I had not understood the subtlety that sigma is not reset, but the result has now begun to look more like it should.
I'll look at your edits carefully.

Offline anatolsen  
#11 Posted : 10 June 2021 09:25:33(UTC)
anatolsen


Rank: Member

Groups: Registered
Joined: 08/06/2021(UTC)
Posts: 16
Location: Караганда

Thank you, Martin
Your math is too complicated for me, and I'm not very good with matrices and vectors, but I'll try to figure it out.
Offline anatolsen  
#12 Posted : 10 June 2021 09:34:10(UTC)
anatolsen


Rank: Member

Groups: Registered
Joined: 08/06/2021(UTC)
Posts: 16
Location: Караганда

Dear Jean Giraud,
You certainly offer an approximation, but unfortunately not the one I need.
Thanks.

Offline anatolsen  
#13 Posted : 14 June 2021 00:03:16(UTC)
anatolsen


Rank: Member

Groups: Registered
Joined: 08/06/2021(UTC)
Posts: 16
Location: Караганда

Dear overlord,
I decided to take a simpler case for now, approximating with the sum of two distributions.
With some tweaks, the time limit can be worked around. I haven't tried the Fit function yet, though; perhaps it will save time.
Can you tell me how to limit the precision of the calculations? Sq is calculated to 15 decimal places, while 9 digits are quite enough for me. This would save time.
The approximation is quite satisfactory. The definite integral differs from the original value by only 1.5%. Given that the raw data is quite rough (it is actually a histogram), this is quite acceptable.
But I still need to calculate the Pearson correlation coefficient, and the reference does not say how to find the sum of the elements of a vector. Can you tell me what to do in this case?

Pirson-0.sm (15kb) downloaded 10 time(s).

Regards.


Offline overlord  
#14 Posted : 14 June 2021 02:19:06(UTC)
overlord


Rank: Advanced Member

Groups: Registered
Joined: 23/07/2013(UTC)
Posts: 1,124
Turkey

Was thanked: 506 time(s) in 337 post(s)
Originally Posted by: anatolsen
But I still need to calculate the Pearson correlation coefficient, and the reference does not say how to find the sum of the elements of a vector. Can you tell me what to do in this case?

Regards.

Below you can find how to calculate the Pearson correlation coefficient.
To sum all elements of a vector, you can use sum(1).
To operate piecewise, you can use sum(4).

You can also use a for loop or while() to calculate something over a vector/matrix. That is not demonstrated here.
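For readers without the worksheet, a minimal Python sketch of the Pearson correlation coefficient built only from sums over the elements. It is not the SMath code from pearson.sm, and the function name pearson is made up here.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length, at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of products of deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    # Note: a constant sequence gives zero variance and a division by zero.
    return cov / math.sqrt(var_x * var_y)
```

In the fitting problem, xs would be the measured values and ys the values of the fitted curve at the same points.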

Regards

pearson.sm (9kb) downloaded 8 time(s).

pearson.png


Offline anatolsen  
#15 Posted : 14 June 2021 13:55:57(UTC)
anatolsen


Rank: Member

Groups: Registered
Joined: 08/06/2021(UTC)
Posts: 16
Location: Караганда

Dear overlord.

Thank you very much.
All problems, as I see, are solved.

I had a general idea of how to use loops (for and while) for calculations, but it did not occur to me to look at the sum functions.
The fact is that I use a non-English version, and the help in it does not work very well yet.
Thanks.

There is only one small problem left: is it possible to reduce the precision with which Sq, and accordingly Sqmin, is calculated? As seen in the calculations, they are calculated to 15 decimal places.
This obviously affects the calculation time, whereas 9 decimal places, maybe even 8, would be enough for me.

As you rightly pointed out, the calculation time becomes huge.
But I have now worked around this as follows. First I manually select an approximate solution (about 5-10 minutes). Then I set the iteration step to 2 or 3 with wide limits.
That calculation takes 40 minutes. After that I set the iteration step to 1 and at the same time narrow the range of values. That calculation takes about 10 minutes.
This gives the required accuracy, especially since the integration after the calculation confirms it.
I'll also calculate the Pearson coefficient. Overall this is fine; you can't ask for anything better. For one set of source data, it takes about 1 hour.
If we have to fit three normal distributions instead of two, the time will of course increase, but I think it will remain within reasonable limits.
Still, if it were possible to reduce the precision of the calculations, the calculation would become even easier.
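The two-stage search described above (a coarse step over wide limits, then step 1 over a narrowed range) can be sketched in Python roughly as follows. The function names, the integer grid and the margin parameter are assumptions made for illustration, not taken from the worksheet.

```python
import itertools

def grid_search(objective, ranges):
    """Try every combination in the ranges and keep the smallest value."""
    best_point, best_value = None, float("inf")
    for point in itertools.product(*ranges):
        value = objective(point)
        if value < best_value:
            best_point, best_value = point, value
    return best_point, best_value

def coarse_to_fine(objective, bounds, coarse_step=3, margin=3):
    """Coarse pass over the full bounds, then a step-1 pass near the best point.

    bounds is a list of (low, high) integer limits, one pair per parameter.
    """
    coarse = [range(lo, hi + 1, coarse_step) for lo, hi in bounds]
    start, _ = grid_search(objective, coarse)
    # Narrow each range to +/- margin around the coarse optimum, clipped to bounds.
    fine = [
        range(max(lo, c - margin), min(hi, c + margin) + 1)
        for c, (lo, hi) in zip(start, bounds)
    ]
    return grid_search(objective, fine)
```

The saving comes from the coarse pass visiting far fewer points. The price is that a narrow minimum lying between coarse grid points can be missed, so a check such as the integral comparison is still worthwhile.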

Regards.

Offline anatolsen  
#16 Posted : 14 June 2021 14:04:03(UTC)
anatolsen


Rank: Member

Groups: Registered
Joined: 08/06/2021(UTC)
Posts: 16
Location: Караганда

Dear Martin Kraska

I tried to set up the solution as you suggested.
But for some reason, the program gives an error.

In addition, I have a question: you suggested a solution in which the approximation is based on the sum of two normal distributions.
In my source data this is not always the case. It is often necessary to approximate with the sum of three distributions.
Will this increase the calculation time to inconvenient values?

Fit-1.sm (11kb) downloaded 7 time(s).

Regards.
Offline anatolsen  
#17 Posted : 14 June 2021 14:14:28(UTC)
anatolsen


Rank: Member

Groups: Registered
Joined: 08/06/2021(UTC)
Posts: 16
Location: Караганда

Originally Posted by: Jean Giraud

There are 3 Pearson coefficients.
From there, up you go.



Thanks

Your way of writing the solution is certainly more compact.
However, it will be harder for me to adapt it to my other cases than overlord's version.

Regards.
Offline overlord  
#18 Posted : 14 June 2021 14:47:14(UTC)
overlord


Rank: Advanced Member

Groups: Registered
Joined: 23/07/2013(UTC)
Posts: 1,124
Turkey

Was thanked: 506 time(s) in 337 post(s)
Originally Posted by: anatolsen
There is only one small problem left: is it possible to reduce the precision with which Sq, and accordingly Sqmin, is calculated?


I have an idea for that, for your original function. I don't know whether it is applicable, or whether I have the right idea or the skills.

I will try to implement it when I am available.

Regards

Offline anatolsen  
#19 Posted : 14 June 2021 17:49:53(UTC)
anatolsen


Rank: Member

Groups: Registered
Joined: 08/06/2021(UTC)
Posts: 16
Location: Караганда

Originally Posted by: overlord

I have an idea for that, for your original function. I don't know whether it is applicable, or whether I have the right idea or the skills.

I will try to implement it when I am available.

Regards



Thanks.
Here is the entire calculation from start to finish as it currently stands.

Pirsonend.sm (37kb) downloaded 13 time(s).

Regards
Offline anatolsen  
#20 Posted : 14 June 2021 18:16:14(UTC)
anatolsen


Rank: Member

Groups: Registered
Joined: 08/06/2021(UTC)
Posts: 16
Location: Караганда

Originally Posted by: Jean Giraud

Problem is that we don't know what you are looking for.
Are you looking for the best fit to your data set ???
If so, minute job.
Please, don't hesitate ... Jean



I wrote down what I need at the very beginning.
There is a data set (two rows of numbers).
In principle, they should be random sets.
However, in reality two or even three different data sets are mixed in it.
I need to separate them, that is, to identify the normal distributions that correspond to the pure (unmixed) data sets.
In my previous reply to overlord, I posted the full calculation method.

Now only one question remains. Iterating through 9 parameters takes a lot of time, and the calculation is carried out with a precision that is excessive for my purposes.
Sq and Sqmin are calculated to 15 decimal places, whereas 8 or 9 digits after the decimal point would be more than enough for me.
This would significantly reduce the calculation time.
Also, my Pearson correlation coefficient is 0.9999, whereas 0.9 would be more than enough.

On the other hand, perhaps the Pearson coefficient should not be made worse.
In that calculation (in the previous reply) the error after integration was 1.5%; when I shifted the parameters slightly, it increased to 2.4%.
And this is more important than the Pearson coefficient. I would not calculate it at all, but I will definitely be asked about this coefficient.

There the definite integral of WW(x) is taken, and it equals 0.9658.
Initially I have a histogram g2(x), and its integral equals 0.99.
And in my case, this is more important than the Pearson coefficient.
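The area check above can be written out as a short Python sketch. The trapezoid rule and the function names are assumptions for illustration; the thread does not say how WW(x) is integrated in the worksheet. The last line reproduces the 2.4% figure from the two integrals quoted in the post.

```python
import math

def trapezoid(f, a, b, n=1000):
    """Approximate the definite integral of f over [a, b] with n trapezoids."""
    h = (b - a) / n
    total = 0.5 * (f(a) + f(b))
    for i in range(1, n):
        total += f(a + i * h)
    return total * h

def relative_difference(value, reference):
    """Relative deviation of value from reference, as a fraction."""
    return abs(value - reference) / abs(reference)

# Integral of the fitted curve (0.9658) against the histogram integral (0.99).
print(round(100 * relative_difference(0.9658, 0.99), 1))  # 2.4
```

A fitted curve would be passed to trapezoid as f, with a and b covering the range of the data.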