Glitches in RRD data

Glitches in RRD data

Andrew Thoms

Hi,

I have been using owfs at home for a while now, and I am most impressed with how easy it has been to set up and use. I wrote a PIC interface to 1-wire for a uni project a few years ago and had the sensors lying around doing nothing, so I installed owfs and I haven't looked back…

I recently purchased a WRT54G and proceeded to add two serial ports to it before I even thought of using it…

I had some trouble getting it to discover the 1-wire network (I think it might be a problem with the second serial port, but I haven't bothered to check it yet).

I sidestepped the problem by plugging the 1-wire bus into my Red Hat 9 box running owserver and setting up the WRT54G to access the bus remotely.

This has been running for a few weeks now without any problems… I can check the RRD graphs from work and dream of being at home in the swimming pool (well, maybe not when it is 14 °C…)

Yesterday it started to have glitches in the temperature readings; all the sensors seem to have coincidental spikes or dips in their values at random intervals.

I have attached some of the graphs from this morning.

The weekly data shows how it has been smooth all week and now it looks all hairy.

(Even though I have been using it for a few weeks, the data is only there since the last reboot of the WRT54G.)

Has anyone seen a similar thing, or have any ideas where to start looking for the problem?

Andrew Thoms

Electrical Engineer

Silex Systems Limited
Lucas Heights Science & Technology Centre
Building 64, New Illawarra Road
LUCAS HEIGHTS NSW 2234
AUSTRALIA

Ph: + 61 2 9532 1331 Fx: + 61 2 9532 1332
www.silex.com.au

This email contains information which is CONFIDENTIAL and that may be subject to LEGAL PRIVILEGE. If you are not the intended recipient, you must not read, use, disseminate, distribute or copy this email or its attachments. If you have received this in error, please inform us immediately by return email, facsimile or by telephone and delete this email.

 


Attachments: all_temperature.png (15K), week_temperature.png (19K)

Re: Glitches in RRD data

Vadim Tkachenko
Andrew Thoms wrote:

> Yesterday it started to have glitches in the temperature readings; all
> the sensors seem to have coincidental spikes or dips in their values
> at random intervals.
>
> I have attached some of the graphs from this morning.
>
> The weekly data shows how it has been smooth all week and now it looks
> all hairy.
>
> (even though I have been using it for a few weeks, the data is only
> there since the last reboot of the WRT54G)
>
> Has anyone seen a similar thing, or have any ideas where to start
> looking for the problem…

The graphs are too small to really see anything (even though I had to approve the posting - it was over the limit) - could you render them way bigger (say 1024x768) and post them somewhere, then send the URL here? This may give some additional ideas...

Anyway, if my eyes don't betray me, I see two kinds of anomalies: positive and negative shifts. The positive shifts seem to affect all the sensors. Check that your power is OK; I've also seen this kind of behavior when the sensors are polled too fast.

About the negative spike... I've had one like that, on Dec 28 2003. Attached is the screenshot. The cause was a power loss to the sensor network while the computer lived on, powered by the UPS.

Check your power.
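
To tell shifts that hit every sensor at once (which point at power or polling) from spikes on a single sensor (which point at that sensor or its wiring), the logged values can be compared step by step. The following is only a rough Python sketch: the input format, the sensor names and the 2-degree threshold are all assumptions made for illustration, not anything owfs or temploggerd provides.

```python
def classify_glitches(samples, threshold=2.0):
    """Label each polling step as 'ok', 'common' or 'single'.

    samples:   list of dicts mapping sensor name -> temperature in deg C,
               one dict per polling interval (hypothetical input format).
    threshold: jump between consecutive samples, in deg C, that is treated
               as a glitch (assumed value, tune it for your sensors).
    """
    labels = ["ok"]  # the first sample has nothing to compare against
    for prev, cur in zip(samples, samples[1:]):
        jumped = [name for name in cur
                  if name in prev and abs(cur[name] - prev[name]) > threshold]
        if not jumped:
            labels.append("ok")
        elif len(jumped) == len(cur):
            labels.append("common")   # every sensor moved together
        else:
            labels.append("single")   # only some sensors moved
    return labels


# Made-up readings for two hypothetical sensors.
readings = [
    {"pool": 14.0, "roof": 18.0},
    {"pool": 14.1, "roof": 18.1},
    {"pool": 19.0, "roof": 23.2},  # both jump up
    {"pool": 14.1, "roof": 18.0},  # both fall back
    {"pool": 14.2, "roof": 30.0},  # only "roof" jumps
]
labels = classify_glitches(readings)
print(labels)
```

Note that the return to normal after a spike is itself a jump, so it gets the same label as the spike that preceded it.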

--vt


Attachment: temp-day.png (24K)

Re: Glitches in RRD data

Christian Magnusson
In reply to this post by Andrew Thoms

Isn't one of the serial ports allocated to the kernel console by
default? I'm not sure about it on the WRT54G router, but
it's worth making sure that's not the case.

BTW: Are you using the latest owfs_wrt54g-firmware-1.0?

Could you please log in to your router and run a simple 'ps'?
Do any of your owserver or owfs processes take more than
1000 kB of memory? Also check 'dmesg' for any
'out of memory' messages.
This could perhaps be one reason why it's reading wrong
temperatures. I have recently noticed a memory
leak in owserver, at least when I read the router sensors from
another server.

The spikes you get in the graphs might also be caused by
major read problems on the 1-wire bus... Perhaps water on the
pool-sensor connectors?

/Christian



--
Christian Magnusson <[hidden email]>



_______________________________________________
Owfs-developers mailing list
[hidden email]
https://lists.sourceforge.net/lists/listinfo/owfs-developers

RE: Glitches in RRD data

Andrew Thoms
In reply to this post by Andrew Thoms
Hi Christian,

I am using the first flashable image you put online...
owfs_wrt54g-firmware-0.1

The first serial port (/dev/tts/0) is set aside as the console; I can plug that into a terminal and get the normal kernel output at boot time.

I was plugging into the second port for 1-wire... I have just been too lazy to debug it so far. I haven't even plugged a terminal into it to check yet... most likely pins 2 and 3 are swapped, or something simple like that.

I am not running the owserver process on the wrt54g, but here is the list of ow related things that are running...

ewrt ~# ps
  PID  Uid     VmSize Stat Command
   55 root        168 S   httpd
  120 root        676 S   owfs -P /var/run/owfs.pid -s 192.168.2.2:3002
  136 root        676 S   owfs -P /var/run/owfs.pid -s 192.168.2.2:3002
  137 root        676 S   owfs -P /var/run/owfs.pid -s 192.168.2.2:3002
16070 root        676 S   owfs -P /var/run/owfs.pid -s 192.168.2.2:3002
 4182 root        160 S   owhttpd -P /var/run/owhttpd.pid -s
 4183 root        160 S   owhttpd -P /var/run/owhttpd.pid -s
 4206 root        160 S   owhttpd -P /var/run/owhttpd.pid -s
25005 root        360 S   owfs -P /var/run/owfs.pid -s 192.168.2.2:3002
25021 root        360 S   owfs -P /var/run/owfs.pid -s 192.168.2.2:3002
25022 root        360 S   owfs -P /var/run/owfs.pid -s 192.168.2.2:3002
25025 root        360 S   owfs -P /var/run/owfs.pid -s 192.168.2.2:3002
22827 root        188 S   temploggerd -c /opt/temploggerd.conf -P
22832 root        168 S   httpd2 -p 8004 -h /var/www
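
For anyone who wants to check the 1000 kB question automatically, a listing like the one above can be parsed with a few lines of Python. This is only a sketch: it assumes the five-column layout shown here (PID, Uid, VmSize, Stat, Command) and that VmSize is in kB, and the function names are made up for the example.

```python
def parse_ps(text):
    """Parse 'ps' output with columns PID, Uid, VmSize, Stat, Command."""
    rows = []
    for line in text.strip().splitlines():
        fields = line.split(None, 4)
        # Skip the header and any line without a numeric PID and VmSize.
        if len(fields) < 5 or not fields[0].isdigit() or not fields[2].isdigit():
            continue
        pid, uid, vmsize, stat, command = fields
        rows.append({"pid": int(pid), "uid": uid, "vmsize_kb": int(vmsize),
                     "stat": stat, "command": command})
    return rows


def over_limit(rows, limit_kb=1000, prefix="ow"):
    """Return rows whose command starts with prefix and exceeds limit_kb."""
    return [r for r in rows
            if r["command"].startswith(prefix) and r["vmsize_kb"] > limit_kb]


# A few lines taken from the listing above.
sample = """\
  PID  Uid     VmSize Stat Command
   55 root        168 S   httpd
  120 root        676 S   owfs -P /var/run/owfs.pid -s 192.168.2.2:3002
 4182 root        160 S   owhttpd -P /var/run/owhttpd.pid -s
25005 root        360 S   owfs -P /var/run/owfs.pid -s 192.168.2.2:3002
22827 root        188 S   temploggerd -c /opt/temploggerd.conf -P
"""

rows = parse_ps(sample)
big = over_limit(rows)
largest_ow = max(r["vmsize_kb"] for r in rows if r["command"].startswith("ow"))
print("ow* processes over 1000 kB:", big)
print("largest ow* VmSize (kB):", largest_ow)
```

On the sample lines nothing is over the limit, and the largest ow process is 676 kB.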

Strangely enough, when I went to make higher-res graphs last night, the problem seemed to have fixed itself... there is just under one day of "hairy" data and now it is smooth again. If it happens again, I will look into it then; it's a bit too hard to debug when it has disappeared...

I have noticed that if I am ssh'd into the router and I view one of the temploggerd web pages, it sometimes (about 50% of the time) kicks me off the ssh connection with a simple "Connection to router closed." Have you seen the same thing? More likely to be a dropbear thing than temploggerd-related, I guess...

The pool sensors are encapsulated in 6 mm stainless tubes about half a foot long, filled with epoxy. These are my second generation, as the first ones failed in less than a week: the salt water of the pool seeped into the sensor legs and one of them became "sacrificial". The second-generation sensors have been installed for about six months with no problems at all.

There are some out-of-memory messages in /var/log/messages. At first glance, these don't appear to correlate with the spikes in the ow data... Here is a sample of the error messages:

Jun 13 07:18:35 Router user.err kernel: Out of Memory: Killed process 28187 (process_monitor).
Jun 13 07:30:06 Router user.err kernel: Out of Memory: Killed process 29146 (process_monitor).
Jun 13 07:31:17 Router syslog.info -- MARK --
...
Jun 13 14:08:00 Router cron.info cron[2116]: (root) CMD (/sbin/check_ps)
Jun 13 14:08:53 Router user.err kernel: Out of Memory: Killed process 1632 (process_monitor).
Jun 13 14:20:02 Router user.err kernel: Out of Memory: Killed process 2156 (process_monitor).
Jun 13 14:29:36 Router user.err kernel: Out of Memory: Killed process 2284 (process_monitor).
Jun 13 14:49:59 Router user.err kernel: Out of Memory: Killed process 2438 (process_monitor).
Jun 13 15:00:04 Router user.err kernel: Out of Memory: Killed process 2702 (process_monitor).
Jun 13 15:09:51 Router user.err kernel: Out of Memory: Killed process 2878 (process_monitor).
Jun 13 15:10:00 Router cron.info cron[2980]: (root) CMD (/sbin/check_ps)
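
Rather than checking the correlation by eye, the kill times can be pulled out of the log and compared with the times of the spikes. Again, just a sketch in Python: the regular expression is written against the lines above, the year is assumed (syslog does not record it), and the spike times in the example are invented.

```python
import re
from datetime import datetime, timedelta

# Matches lines such as:
# Jun 13 07:18:35 Router user.err kernel: Out of Memory: Killed process 28187 (process_monitor).
OOM_RE = re.compile(
    r"^(?P<stamp>\w{3}\s+\d+\s+\d\d:\d\d:\d\d)\s+\S+\s+user\.err kernel: "
    r"Out of Memory: Killed process (?P<pid>\d+) \((?P<name>[^)]+)\)"
)


def parse_oom(log_text, year=2005):
    """Return (time, pid, name) for each out-of-memory kill in log_text."""
    events = []
    for line in log_text.splitlines():
        m = OOM_RE.match(line.strip())
        if m:
            when = datetime.strptime(f"{year} {m['stamp']}", "%Y %b %d %H:%M:%S")
            events.append((when, int(m["pid"]), m["name"]))
    return events


def spikes_near_oom(events, spike_times, window=timedelta(minutes=2)):
    """Return the spike times that fall within window of any kill."""
    return [s for s in spike_times
            if any(abs(s - when) <= window for when, _, _ in events)]


sample_log = """\
Jun 13 07:18:35 Router user.err kernel: Out of Memory: Killed process 28187 (process_monitor).
Jun 13 07:31:17 Router syslog.info -- MARK --
Jun 13 14:08:00 Router cron.info cron[2116]: (root) CMD (/sbin/check_ps)
Jun 13 14:08:53 Router user.err kernel: Out of Memory: Killed process 1632 (process_monitor).
"""

events = parse_oom(sample_log)
# Hypothetical spike times read off a graph.
spikes = [datetime(2005, 6, 13, 7, 19, 0), datetime(2005, 6, 13, 10, 0, 0)]
matched = spikes_near_oom(events, spikes)
print(len(events), "kills,", len(matched), "of", len(spikes), "spikes near a kill")
```

The two-minute window is arbitrary; it should be at least as long as the logging interval.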


Thanks for all the effort you put into owfs. It is an excellent project and has saved me massive amounts of time.
Andrew

RE: Glitches in RRD data

Christian Magnusson

You definitely need to update owfs to the latest version, since you
are running out of memory in the router.

If owfs/owhttpd/owserver is used with the "-s" flag, ServerDir() will
be called and there will be a memory leak.

Some process in the router (process_monitor) will kill
the largest process if the router is running out of RAM, and in this case it's
certainly the main or a child process of owfs. If that process is killed,
I guess temploggerd will be pretty confused when reading from /var/1wire
and the result will probably be wrong.

I have uploaded a newly compiled version of the firmware (version 1.1b),
and this doesn't seem to have any memory leaks.


Do you have any ideas on how to preserve the rrd file when the router is
upgraded with new firmware, or when it's restarted? The file is too big to save
in /opt, and it's no fun to lose all the data when it holds several weeks
of saved graphs.
I guess the only way is to save it on some other computer frequently and
then restore the rrd file after rebooting the router. Any ideas?

/Christian



--
Christian Magnusson <[hidden email]>




RE: Glitches in RRD data

Andrew Thoms
In reply to this post by Andrew Thoms
Hi,

I figure the easiest way is to scp the rrd file off the router just before the upgrade and back again afterwards. That works fine for a planned reboot, but is no good in a power failure... A 12 V gel cell battery could probably provide a crude UPS for the WRT54G. Has anyone seen any projects to add memory to the WRT54G? That way we could enlarge the /opt file system and store the rrd file there.
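
The copy could be scripted on the other computer and run from cron, so that a recent backup also exists after an unplanned reboot. A minimal Python sketch follows; the host name, user and both file paths are placeholders (the real location of the rrd file is not given in this thread), and it only prints the scp commands unless dry_run is turned off.

```python
import shlex
import subprocess


def backup_cmd(host, remote_path, local_path, user="root"):
    """scp arguments for copying the rrd file from the router."""
    return ["scp", "-p", f"{user}@{host}:{remote_path}", local_path]


def restore_cmd(host, remote_path, local_path, user="root"):
    """scp arguments for copying the saved rrd file back to the router."""
    return ["scp", "-p", local_path, f"{user}@{host}:{remote_path}"]


def run(cmd, dry_run=True):
    """Print the command, and execute it only when dry_run is False."""
    print(shlex.join(cmd))
    if dry_run:
        return 0
    return subprocess.run(cmd, check=True).returncode


# Placeholder host and paths, adjust to the real setup.
HOST = "router"
REMOTE = "/var/lib/temps.rrd"
LOCAL = "/backup/temps.rrd"

backup = backup_cmd(HOST, REMOTE, LOCAL)
restore = restore_cmd(HOST, REMOTE, LOCAL)
run(backup)   # before the upgrade, or periodically from cron
run(restore)  # after the router is back up
```

The -p option asks scp to preserve the file's modification time, which makes it easier to see how old a backup is.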

Andrew Thoms

Electrical Engineer

Silex Systems Limited
Lucas Heights Science & Technology Centre
Building 64, New Illawarra Road
LUCAS HEIGHTS NSW 2234
AUSTRALIA

Ph: + 61 2 9532 1331 Fx: + 61 2 9532 1332
www.silex.com.au
 

This email contains information which is CONFIDENTIAL and that may be subject to LEGAL PRIVILEGE. If you are not the intended recipient, you must not read, use, disseminate, distribute or copy this email or its attachments. If you have received this in error, please inform us immediately by return email, facsimile or by telephone and delete this email.



-----Original Message-----
From: Christian Magnusson [mailto:[hidden email]]
Sent: Thursday, 16 June 2005 6:20 PM
To: owfs-developers
Cc: Andrew Thoms
Subject: RE: [Owfs-developers] Glitches in RRD data


You definitely need to update owfs to the latest version since you
are running out of memory in the router.

If owfs/owhttpd/owserver is used with the "-s" flag, ServerDir() will
be called and there will be a memory leak.

Some process in the router (process_monitor), will kill
the largest process if it's running out of ram, and in this case it's
certainly the main or child-process to owfs. If the process is killed
I guess temploggerd will be pretty confused when reading from /var/1wire
and the result will probably be wrong.

I have uploaded a new compiled version of the firmware (version 1.1b)
and this doesn't seem to have any memory leaks.


Have you any ideas how to save the rrd-file after upgrading the router
with new firmware, or if it's restarted?  The file is too big to save
in /opt and it's not fun to loose all data when it's several weeks
with saved graphs.
I guess the only way is to save it on some other computer frequently and
then restore the rrd-file after rebooting the router. Any ideas?

/Christian



On Thu, 2005-06-16 at 08:58 +1000, Andrew Thoms wrote:

> Hi Christian,
>
> I am using the first flashable image you put online...
> owfs_wrt54g-firmware-0.1
>
> The first serial port (/dev/tts/0) is set aside as the console, I can plug that into a terminal and get the normal kernel output at boot time....
>
> I was plugging into the second port for 1-wire... I just have been too lazy to debug it so far. I haven't even plugged a terminal into it to check yet... most likely a pin 2,3 swapped problem or something simple like that.
>
> I am not running the owserver process on the wrt54g, but here is the list of ow related things that are running...
>
> ewrt ~# ps
>   PID  Uid     VmSize Stat Command
>    55 root        168 S   httpd
>   120 root        676 S   owfs -P /var/run/owfs.pid -s 192.168.2.2:3002
>   136 root        676 S   owfs -P /var/run/owfs.pid -s 192.168.2.2:3002
>   137 root        676 S   owfs -P /var/run/owfs.pid -s 192.168.2.2:3002
> 16070 root        676 S   owfs -P /var/run/owfs.pid -s 192.168.2.2:3002
>  4182 root        160 S   owhttpd -P /var/run/owhttpd.pid -s
>  4183 root        160 S   owhttpd -P /var/run/owhttpd.pid -s
>  4206 root        160 S   owhttpd -P /var/run/owhttpd.pid -s
> 25005 root        360 S   owfs -P /var/run/owfs.pid -s 192.168.2.2:3002
> 25021 root        360 S   owfs -P /var/run/owfs.pid -s 192.168.2.2:3002
> 25022 root        360 S   owfs -P /var/run/owfs.pid -s 192.168.2.2:3002
> 25025 root        360 S   owfs -P /var/run/owfs.pid -s 192.168.2.2:3002
> 22827 root        188 S   temploggerd -c /opt/temploggerd.conf -P
> 22832 root        168 S   httpd2 -p 8004 -h /var/www
>
> Strangely enough, when I went to make higher res graphs last night, the problems have seemed to fix themselves... there is just under 1 day of "hairy" data and now it is smooth again... if it does it again, I will bother then... it's a bit too hard to debug when it has disappeared...
>
> I have noticed that if I am ssh'd into the router, and I view one of the temploggerd web pages, it sometimes (about 50% of the time) kicks me off the ssh connection with a simple "Connection to router closed." Have you seen the same thing? More likely to be a dropbear thing that templogger related I guess...
>
> The pool sensors are encapsulated in 6mm stainless tubes about 1/2 a foot long filled with epoxy. These are my second generation as the first ones failed in less than a week. The salt water of the pool seeped into the sensor legs and one of them became "sacrificial". The second generation sensors have been installed for about six months with no problem at all.
>
> There are some out of memory messages in /var/log/messages, at first glance, these don't appear to correlate to the spikes in the ow data... here is a sample of the error messages...
>
> Jun 13 07:18:35 Router user.err kernel: Out of Memory: Killed process 28187 (process_monitor).
> Jun 13 07:30:06 Router user.err kernel: Out of Memory: Killed process 29146 (process_monitor).
> Jun 13 07:31:17 Router syslog.info -- MARK --
> ...
> Jun 13 14:08:00 Router cron.info cron[2116]: (root) CMD (/sbin/check_ps)
> Jun 13 14:08:53 Router user.err kernel: Out of Memory: Killed process 1632 (process_monitor).
> Jun 13 14:20:02 Router user.err kernel: Out of Memory: Killed process 2156 (process_monitor).
> Jun 13 14:29:36 Router user.err kernel: Out of Memory: Killed process 2284 (process_monitor).
> Jun 13 14:49:59 Router user.err kernel: Out of Memory: Killed process 2438 (process_monitor).
> Jun 13 15:00:04 Router user.err kernel: Out of Memory: Killed process 2702 (process_monitor).
> Jun 13 15:09:51 Router user.err kernel: Out of Memory: Killed process 2878 (process_monitor).
> Jun 13 15:10:00 Router cron.info cron[2980]: (root) CMD (/sbin/check_ps)
>
>
> Thanks for all the effort you put into owfs. It is an excellent project and has saved me massive amounts of time.
> Andrew
>
> -----Original Message-----
> From: Christian Magnusson [mailto:[hidden email]]
> Sent: Tuesday, 14 June 2005 5:22 PM
> To: owfs-developers
> Cc: Andrew Thoms
> Subject: Re: [Owfs-developers] Glitches in RRD data
>
>
> Isn't one of the serial ports allocated to the kernel-console by
> default? I'm not sure about it when using the WRT54G router, but
> it's worth making sure it's not the case.
>
> BTW: Do you use the latest owfs_wrt54g-firmware-1.0 ?
>
> Could you please log in to your router and run a simple 'ps'?
> Do any of your owserver or owfs processes take more than
> 1000kb memory?  Also check 'dmesg' for any
> 'out of memory' messages.
> This could perhaps be one reason why it's reading wrong
> temperatures. I have recently noticed there is a memory
> leak in owserver, at least when I read the router-sensors from
> another server.
>
> The spikes you get in the graphs might also be there because of
> major read-problems on the 1-wire bus... Perhaps water on the
> pool-sensor connectors?
>
> /Christian
>
> --
> Christian Magnusson <[hidden email]>
--
Christian Magnusson <[hidden email]>



-------------------------------------------------------
SF.Net email is sponsored by: Discover Easy Linux Migration Strategies
from IBM. Find simple to follow Roadmaps, straightforward articles,
informative Webcasts and more! Get everything you need to get up to
speed, fast. http://ads.osdn.com/?ad_id=7477&alloc_id=16492&op=click
_______________________________________________
Owfs-developers mailing list
[hidden email]
https://lists.sourceforge.net/lists/listinfo/owfs-developers
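
To check whether the out-of-memory kills line up with the spikes, the kill timestamps can be pulled out of /var/log/messages and compared against the time of each spike. The following is only a rough sketch: the line format is taken from the sample quoted above, the year is passed in because syslog does not record it, and the five-minute window and the function names are arbitrary assumptions, not part of owfs or temploggerd.

```python
import re
from datetime import datetime

# Shaped after the sample lines quoted in this thread, e.g.
# "Jun 13 07:18:35 Router user.err kernel: Out of Memory: Killed process 28187 (process_monitor)."
OOM_LINE = re.compile(
    r"^(?P<stamp>\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2})\s+\S+\s+\S+\s+kernel:\s+"
    r"Out of Memory: Killed process (?P<pid>\d+) \((?P<name>[^)]+)\)"
)


def oom_kills(lines, year=2005):
    """Return (datetime, pid, name) for each OOM-kill line; other lines are ignored."""
    kills = []
    for line in lines:
        # Drop any mail-quote markers ("> ") in front of the log line.
        match = OOM_LINE.match(line.lstrip("> ").rstrip())
        if match:
            when = datetime.strptime(
                f"{year} {match.group('stamp')}", "%Y %b %d %H:%M:%S"
            )
            kills.append((when, int(match.group("pid")), match.group("name")))
    return kills


def kills_near(kills, spike_time, window_seconds=300):
    """Return the kills that happened within window_seconds of a spike (window is a guess)."""
    return [
        kill for kill in kills
        if abs((kill[0] - spike_time).total_seconds()) <= window_seconds
    ]
```

Feeding it the log lines plus the time of a spike read off the graph gives the kills that happened close to that spike; an empty list for every spike would support the "does not correlate" impression.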

RE: Glitches in RRD data

Christian Magnusson

All WRT54G models have 4 MB of flash, but the WRT54GS has 8 MB. If you
want to upgrade the router with more space in /opt, then it's probably
easier to buy a GS version (and recompile ewrt for it).

This will probably not solve the upgrade problem though... My guess
is that all flash is erased when upgrading to new firmware, and then
your rrd-file will be deleted anyway.

I don't know what the lifetime of the flash is either, so it's
perhaps not good to keep the rrd-file permanently in /opt and then
read/write to it every minute... Perhaps it works for some years
before the flash wears out, but I don't know for sure...

/Christian
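
To put a rough number on the "write every minute" concern, the update count per year is simple arithmetic. The sketch below is an illustration only: it assumes every rrd update costs exactly one block erase and that erases are spread evenly over the blocks in use, which a real flash filesystem may or may not do. The endurance figure is a parameter to be taken from the flash chip's datasheet; none is assumed here.

```python
MINUTES_PER_YEAR = 60 * 24 * 365  # 525,600


def updates_per_year(interval_minutes=1):
    """Number of rrd updates in a year at the given update interval."""
    return MINUTES_PER_YEAR / interval_minutes


def years_to_reach(endurance_cycles, blocks_in_rotation=1, interval_minutes=1):
    """Years until each block has been erased endurance_cycles times.

    Assumes one block erase per update, spread evenly over blocks_in_rotation.
    Both assumptions are simplifications made for this sketch.
    """
    return endurance_cycles * blocks_in_rotation / updates_per_year(interval_minutes)
```

One update per minute is 525,600 updates a year, so the result depends heavily on how many blocks share the wear and on the rated erase count of the particular chip.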



On Thu, 2005-06-16 at 18:26 +1000, Andrew Thoms wrote:

> Hi,
>
> I figure the easiest is to do a scp of the rrd file just before and after the upgrade. That works fine for a planned reboot, but is no good in a power failure... A 12v gel cell battery could probably provide a crude ups for the wrt54g. Has anyone seen any projects to add memory to the wrt54g? That way we could increase the /opt file system and store the rrd file there.
>
> Andrew Thoms
>
> -----Original Message-----
> From: Christian Magnusson [mailto:[hidden email]]
> Sent: Thursday, 16 June 2005 6:20 PM
> To: owfs-developers
> Cc: Andrew Thoms
> Subject: RE: [Owfs-developers] Glitches in RRD data
>
>
> You definitely need to update owfs to the latest version since you
> are running out of memory in the router.
>
> If owfs/owhttpd/owserver is used with the "-s" flag, ServerDir() will
> be called and there will be a memory leak.
>
> Some process in the router (process_monitor), will kill
> the largest process if it's running out of ram, and in this case it's
> certainly the main or child-process to owfs. If the process is killed
> I guess temploggerd will be pretty confused when reading from /var/1wire
> and the result will probably be wrong.
>
> I have uploaded a new compiled version of the firmware (version 1.1b)
> and this doesn't seem to have any memory leaks.
>
>
> Do you have any ideas how to keep the rrd-file when the router is
> upgraded with new firmware or restarted?  The file is too big to save
> in /opt and it's not fun to lose all the data when it holds several
> weeks of saved graphs.
> I guess the only way is to save it on some other computer frequently and
> then restore the rrd-file after rebooting the router. Any ideas?
>
> /Christian
>
>
>
> On Thu, 2005-06-16 at 08:58 +1000, Andrew Thoms wrote:
> > Hi Christian,
> >
> > I am using the first flashable image you put online...
> > owfs_wrt54g-firmware-0.1
> >
> > The first serial port (/dev/tts/0) is set aside as the console, I can plug that into a terminal and get the normal kernel output at boot time....
> >
> > I was plugging into the second port for 1-wire... I just have been too lazy to debug it so far. I haven't even plugged a terminal into it to check yet... most likely a pin 2,3 swapped problem or something simple like that.
> >
> > I am not running the owserver process on the wrt54g, but here is the list of ow related things that are running...
> >
> > ewrt ~# ps
> >   PID  Uid     VmSize Stat Command
> >    55 root        168 S   httpd
> >   120 root        676 S   owfs -P /var/run/owfs.pid -s 192.168.2.2:3002
> >   136 root        676 S   owfs -P /var/run/owfs.pid -s 192.168.2.2:3002
> >   137 root        676 S   owfs -P /var/run/owfs.pid -s 192.168.2.2:3002
> > 16070 root        676 S   owfs -P /var/run/owfs.pid -s 192.168.2.2:3002
> >  4182 root        160 S   owhttpd -P /var/run/owhttpd.pid -s
> >  4183 root        160 S   owhttpd -P /var/run/owhttpd.pid -s
> >  4206 root        160 S   owhttpd -P /var/run/owhttpd.pid -s
> > 25005 root        360 S   owfs -P /var/run/owfs.pid -s 192.168.2.2:3002
> > 25021 root        360 S   owfs -P /var/run/owfs.pid -s 192.168.2.2:3002
> > 25022 root        360 S   owfs -P /var/run/owfs.pid -s 192.168.2.2:3002
> > 25025 root        360 S   owfs -P /var/run/owfs.pid -s 192.168.2.2:3002
> > 22827 root        188 S   temploggerd -c /opt/temploggerd.conf -P
> > 22832 root        168 S   httpd2 -p 8004 -h /var/www
> >
--
Christian Magnusson <[hidden email]>
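
The check suggested earlier in the thread (does any owfs or owserver process take more than 1000kb?) can be run over saved 'ps' output to watch for the leak over time. This is a minimal sketch that assumes the five-column layout shown in the quoted listing (PID, Uid, VmSize, Stat, Command) and that VmSize is in kB; the function name and the default process names are made up for illustration.

```python
def oversized(ps_lines, limit_kb=1000, names=("owfs", "owserver", "owhttpd")):
    """Return (pid, vmsize, command) for watched processes above limit_kb."""
    found = []
    for line in ps_lines:
        # Drop mail-quote markers and split into PID, Uid, VmSize, Stat, Command.
        fields = line.lstrip("> ").rstrip().split(None, 4)
        # Skip the header row and anything that does not look like a process row.
        if len(fields) < 5 or not fields[0].isdigit() or not fields[2].isdigit():
            continue
        pid, vmsize, command = int(fields[0]), int(fields[2]), fields[4]
        if command.split()[0] in names and vmsize > limit_kb:
            found.append((pid, vmsize, command))
    return found
```

Run against the listing quoted above it returns an empty list, since the largest owfs process there is 676.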




RE: Glitches in RRD data

Christian Magnusson
In reply to this post by Andrew Thoms

BTW: Take a look at the "Evolutionary Changes" chart to see how much
RAM and flash there is in the different versions:

http://www.linksysinfo.org/modules.php?name=Content&pa=showpage&pid=6


--
Christian Magnusson <[hidden email]>




Re: Glitches in RRD data

Marco Rogantini-2
In reply to this post by Vadim Tkachenko
On Mon, 13 Jun 2005, Vadim Tkachenko wrote:

> Anyway, if my eyes don't betray me, I see two kinds of anomalies - positive
> and negative shifts. It seems that the positive shifts affect all the sensors
> - check if your power is OK, and I've seen this kind of behavior if you're
> polling them too fast.

The temperature of the sensor increases when you trigger the conversion,
because it converts electrical energy into thermal energy. If you don't
leave enough time between successive conversions there is not enough time
for dissipation.

        -marco
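
One way to leave enough time between conversions is to put a minimum interval in front of whatever reads the sensor, and hand back the previous value if asked again too soon. The sketch below assumes that each call to read_fn triggers a conversion; the class name, the 60-second default and the injectable clock are choices made for this illustration and are not part of owfs.

```python
import time


class ThrottledSensor:
    """Wrap a sensor read so it is triggered at most once per min_interval_s."""

    def __init__(self, read_fn, min_interval_s=60.0, clock=time.monotonic):
        self._read_fn = read_fn            # assumed to trigger one conversion per call
        self._min_interval_s = min_interval_s
        self._clock = clock                # injectable so the logic can be tested
        self._last_time = None
        self._last_value = None

    def read(self):
        now = self._clock()
        due = self._last_time is None or now - self._last_time >= self._min_interval_s
        if due:
            # Enough time has passed: do a real read and remember when it happened.
            self._last_value = self._read_fn()
            self._last_time = now
        # Otherwise return the cached value without touching the sensor.
        return self._last_value
```

With owfs mounted, read_fn could be a small function that opens the sensor's temperature file under the mount point and converts the contents to a float; the exact path depends on the sensor ID and is not shown here.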



Re: Glitches in RRD data

Gus S Calabrese
I apologize if this has been mentioned before.
Gluing some aluminum stock (tightly) to the sensor will increase
its thermal mass and improve the speed at which the sensor
returns to the ambient temperature.


On Jun 16, 2005, at 8:07 AM, Marco Rogantini wrote:

On Mon, 13 Jun 2005, Vadim Tkachenko wrote:

> Anyway, if my eyes don't betray me, I see two kinds of anomalies - positive
> and negative shifts. It seems that the positive shifts affect all the sensors
> - check if your power is OK, and I've seen this kind of behavior if you're
> polling them too fast.

The temperature of the sensor increases when you trigger the conversion,
because it converts electrical energy into thermal energy. If you don't
leave enough time between successive conversions there is not enough time
for dissipation.

     -marco

