Concurrency issues?

Concurrency issues?

Guy COLIN
Hello,
I have used owfs since 2010. I had it running on an old 133 MHz Pentium PC driving
my heating system in my previous house. I then switched to a Raspberry Pi,
and later moved to a new house. In my new house I have 2 Raspberry Pis running
owfs. One drives the heating system plus various temperature, humidity and LCD
devices. The second one monitors the swimming pool and drives the pump.
Everything can be seen at my web site gcolin.hd.free.fr
So the first thing I have to say (again) is a big thank you to the developers,
because owfs is just awesome! It makes 1wire life easy ;-) Many thanks for
this.
I am happy with my systems; they are running almost perfectly.
Why almost? Because sometimes I lose devices. I have been fighting with this
for a very long time. It first happened on the swimming pool system last
year. I blamed the hardware, so I replaced the 2482 i2c bus master with
a serial one: same problem. Then a hobby-board hub: same problem. Then I
changed the wiring: same problem. I was banging my head against the wall. It
happened randomly, let's say once a week. Then, for the last 6 months, without
any change (I'm back to the 2482-100 bus master), I haven't lost any devices. My
trends are perfectly logged (I log every 5 minutes).
But a few months ago it started on the other system! I lost a temp
sensor. This system has a 2482-800 bus master (i2c -> 8 channels 1wire). I
moved this sensor to a bus where it is alone: same problem. I changed
its cable (because it was a pair in the same cable as another 1wire bus):
same problem again. I decided to connect it to an Arduino and have this
Arduino send the data to the Pi over serial. It works fine, but guess what? owfs
is now losing another sensor.... :-(
Finally I decided to come here to do some reading, and I see you are saying:
"don't use the owfs-fuse interface for anything but demonstration purposes, as
it has concurrency issues."
Is this my problem? What concurrency issues are you talking about? Every
owfs? Or only when owfs is used in conjunction with the kernel 1wire module
(which I have never tried)?
My 2 systems are identical: Raspbian wheezy, fully read-only (I don't want
to upgrade; once it's running I want to keep it fixed, with no changes to
anything). Running owfs from the Raspbian repo: 2.8p15.
I use the owserver and owfs (FUSE interface) daemons (no owftpd, no owhttpd) and
access all my devices through /mnt/1wire/. I have done this for 6 years. Is that
wrong? Should I use owread, owwrite, owdir, etc.?
Please, if you can, shed some light on these concurrency issues.
Thanks a lot.
--
Guy COLIN


_______________________________________________
Owfs-developers mailing list
[hidden email]
https://lists.sourceforge.net/lists/listinfo/owfs-developers

Re: Concurrency issues?

Steinar Midtskogen
Guy COLIN <[hidden email]> writes:

> Why almost? Because sometimes I lose devices. I have been fighting with this
> for a very long time. First it was happening on the swimming pool system last
> year, I was accusing the hardware so I replaced the 2482 i2c bus master with
> a serial one, same problem, then a hobby-board hub, same problem, then
> changing the wiring same problem. I was banging my head against the wall. It
> was happening randomly, let's say once a week. Then since 6 months without
> any change (I'm back to 2482-100 bus master) I don't lose any devices. My
> trends are perfectly logged (I log every 5 minutes).

Do they reappear if you restart owfs?

--
Steinar


Re: Concurrency issues?

Jan Kandziora
In reply to this post by Guy COLIN
On 22.05.2016 at 18:06, Guy COLIN wrote:
>
> Finally I decided to come here to do some reading and I see you are
> saying: " don't use the owfs-fuse interface for anything but
> demonstration purposes, as it has concurrency issues." Is this my
> problem?
>
Maybe. There are a number of other possible causes when devices
seem to disappear all of a sudden.


> I use owserver and owfs fuse interface daemons (no owftpd, no
> owhttpd), access all my devices thru /mnt/1wire/ did this for 6
> years. Is that wrong?
>
> Should I use owread, owwrite, owdir, etc?
>
You should first check if your problem vanishes when you use

$ owdir /uncached

instead of

$ ls /tmp/ow/uncached


Kind regards

        Jan


Re: Concurrency issues?

Guy COLIN
Hello

Thanks very much for such quick answers.

@Steinar: No, usually the lost device(s) don't reappear after restarting owfs
and owserver. But it varies. I have made so many attempts and couldn't find
any evidence that doing this or that solves the issue. Now I just wait and the
lost device(s) come back by themselves. But it's annoying.

@Jan: OK, I'll do as you suggest. I've just installed ow-shell because it was
not installed on my system (2.8p15 from the Raspbian repo). I now have the
commands owdir, owread, etc.
I have launched a small script to monitor the 1wire bus. Usually my bus
reports 7 devices; this script logs the output of the commands "ls
/mnt/1wire/uncached/" and "owdir /uncached" whenever there are not 7 devices. So
we'll be able to see if there is any difference between those 2 commands.
Thanks for giving me this path to follow. Here is my script:

#!/bin/sh
# 23 May 2016 -- Script to check if any device disappears from the 1wire bus

while :
do
  sleep 5  # check every 5 seconds
  # On my system this must return 7 when there is no problem
  HOW_MANY=$(ls -ld /mnt/1wire/uncached/??.* | wc -l)
  # echo "$HOW_MANY"
  if [ "$HOW_MANY" -eq 7 ]; then
    :  # do nothing
  else
    # log the output of "ls /mnt/1wire/uncached" and "owdir /uncached"
    # (filtered for device directories only)
    echo "==========================================" >> /tmp/LogCheck1wire
    date >> /tmp/LogCheck1wire
    echo "======== ls /mnt/1wire/uncached ==========" >> /tmp/LogCheck1wire
    ls -ld /mnt/1wire/uncached/??.* >> /tmp/LogCheck1wire
    echo "=========== owdir /uncahed ===============" >> /tmp/LogCheck1wire
    # On my system all 1wire device IDs end with 00, so this lists 7 devices
    owdir /uncached | grep 00 >> /tmp/LogCheck1wire
    echo "==========================================" >> /tmp/LogCheck1wire
  fi
done

By the way here is my kernel info:
$ uname -a
Linux Box 3.18.11+ #781 PREEMPT Tue Apr 21 18:02:18 BST 2015 armv6l GNU/Linux

Best regards
--
Guy






Re: Concurrency issues?

Jan Kandziora
On 23.05.2016 at 16:18, Guy COLIN wrote:
>
> By the way here is my kernel info:
> $ uname -a
> Linux Box 3.18.11+ #781 PREEMPT Tue Apr 21 18:02:18 BST 2015 armv6l GNU/Linux
>
This means you have to access I²C directly by using --i2c=... and
unload/blacklist the ds2482 kernel module?

Why? Because for Linux-3.16rc1 and later, you need owfs-3.1p1 at least.

Kind regards

        Jan




Re: Concurrency issues?

Guy COLIN
Hello,


> >
> This means you have to access I²C directly by using --i2c=... and
> unload/blacklist the ds2482 kernel module?
I have this line in my owfs.conf (owserver and owfs are running as services
thru init.d)
server: device = /dev/i2c-1

The i2c module is properly loaded; here is the output of lsmod:
Module                  Size  Used by
nfsd                  276601  2
fuse                   92185  3
i2c_dev                 6709  2
snd_bcm2835            21149  0
snd_pcm                90778  1 snd_bcm2835
snd_seq                61097  0
snd_seq_device          7209  1 snd_seq
snd_timer              23007  2 snd_pcm,snd_seq
snd                    66325  5 snd_bcm2835,snd_timer,snd_pcm,snd_seq,snd_seq_device
i2c_bcm2708             6200  0
cdc_acm                18406  2
uio_pdrv_genirq         3666  0
uio                     9897  1 uio_pdrv_genirq

I have these 2 lines in my /boot/config.txt
dtparam=i2c1=on
dtparam=i2c_arm=on


> Why? Because for Linux-3.16rc1 and later, you need owfs-3.1p1 at least.
>
>

OK, I'll update to owfs-3.1p1. But first let's see what happens with
this script running. It's a pity, though, because owfs 2.8p15 is the standard in
the Raspbian repo, so if it doesn't match the kernel it should be updated. I
believe it is the same in mainline Debian, because Raspbian is basically just a
Debian recompilation for ARM. A few years ago I was happy to see that
standard Raspbian + standard owfs from the Raspbian repo worked "out of the
box". I even posted here about that.

Update:
----------
Up to now my script has caught (almost) nothing. But I'm patient; it will
eventually see it. Actually the script triggered 3 times yesterday, but
unfortunately the log doesn't show anything. Why? Because the script
reads /mnt/1wire/uncached, and if there are fewer (or more) than 7 devices it
reads /mnt/1wire/uncached again and logs the result. Unfortunately the log
shows 7 devices. So what happened is: on the first read there were not 7
devices (most probably 1 had disappeared), but on the second read there were 7
devices. And this second read is the one that is logged, so the log isn't
meaningful.
So I have modified my script to do only one read, put it in a temporary
file, check it, and log it if there are not 7 devices. Also, I now check
/mnt/1wire first and trigger the log depending on this directory; then I also log
/mnt/1wire/uncached, the output of "owdir", and the output of "owdir
/uncached". I do it this way because on my system I almost never use the
uncached directory. I know the difference but I'm OK this way; I let owserver
update the cache. And each time a device disappears I check the uncached
directory: it has disappeared there too. So this new script is running; let's see
what comes from it. It may take days but I'm confident it will see
something.
Here is the useless log (1 out of 3 caught):

Mon May 23 23:04:38 CEST 2016
======== ls /mnt/1wire/uncached ==========
drwxrwxrwx 1 root root 8 May 23 23:04 /mnt/1wire/uncached/10.9702E6010800
drwxrwxrwx 1 root root 8 May 23 23:04 /mnt/1wire/uncached/10.B6FAE5010800
drwxrwxrwx 1 root root 8 May 23 23:04 /mnt/1wire/uncached/26.FFD8F1000000
drwxrwxrwx 1 root root 8 May 23 23:04 /mnt/1wire/uncached/28.4E3066020000
drwxrwxrwx 1 root root 8 May 23 23:04 /mnt/1wire/uncached/29.1C9E09000000
drwxrwxrwx 1 root root 8 May 23 23:04 /mnt/1wire/uncached/3A.546302000000
drwxrwxrwx 1 root root 8 May 23 23:04 /mnt/1wire/uncached/3A.B36002000000
=========== owdir /uncahed ===============
/uncached/10.B6FAE5010800
/uncached/10.9702E6010800
/uncached/3A.546302000000
/uncached/29.1C9E09000000
/uncached/28.4E3066020000
/uncached/3A.B36002000000
/uncached/26.FFD8F1000000
==========================================

And here is the new script:

#!/bin/sh
# 23 May 2016 -- Script to check if any device disappears from the 1wire bus

while :
do
  sleep 5  # check every 5 seconds
  # Take one snapshot of each listing, so what is checked is what gets logged
  ls -ld /mnt/1wire/??.* > /tmp/FileTmp1
  ls -ld /mnt/1wire/uncached/??.* > /tmp/FileTmp2
  owdir | grep 00 > /tmp/FileTmp3
  owdir /uncached | grep 00 > /tmp/FileTmp4
  # On my system this must return 7 when there is no problem
  HOW_MANY=$(wc -l < /tmp/FileTmp1)
  # echo "$HOW_MANY"
  if [ "$HOW_MANY" -eq 7 ]; then
    :  # do nothing
  else
    # log the output of the above commands (filtered for device directories only)
    echo "================START=====================" >> /tmp/LogCheck1wire
    date >> /tmp/LogCheck1wire
    echo "=========== ls /mnt/1wire/ ===============" >> /tmp/LogCheck1wire
    cat /tmp/FileTmp1 >> /tmp/LogCheck1wire
    echo "======== ls /mnt/1wire/uncached ==========" >> /tmp/LogCheck1wire
    cat /tmp/FileTmp2 >> /tmp/LogCheck1wire
    echo "=============== owdir ====================" >> /tmp/LogCheck1wire
    cat /tmp/FileTmp3 >> /tmp/LogCheck1wire
    echo "=========== owdir /uncahed ===============" >> /tmp/LogCheck1wire
    cat /tmp/FileTmp4 >> /tmp/LogCheck1wire
    echo "=================END======================" >> /tmp/LogCheck1wire
  fi
done


Best regards

--

Guy





Re: Concurrency issues?

Andy Carter
On Tuesday 24 May 2016 16:02:42 Guy COLIN wrote:
> Ok I'll update to owfs-3.1p1. But first let's try to see what happen with
> this script running. Also it's a pity because owfs 2.8p15 is the standard
> in  the Raspbian repo, so if it doesn't match the kernel it should be
> updated. I believe it the same in Debian main line because Raspbian is
> basically just a Debian re-compilation for ARM. So a few years ago I was
> happy to see that standard Raspbian + standard owfs from Raspian repo are
> working "out of the box". I even posted here about that.

You are not alone in wishing for an up-to-date Debian/Raspbian version. However,
the various parts of owfs 3.1p1-5 are available from
http://archive.raspbian.org/raspbian/pool/main/o/owfs/
(you can even install the Debian armhf version on a Raspbian install).

Andy


Re: Concurrency issues?

Martin Patzak (GMX)


On 05/24/2016 07:01 PM, Andy Carter wrote:
> You are not alone wishing for an up to date debian/raspbian version,
> however the various parts of 3.1p1-5 owfs are available from
> http://archive.raspbian.org/raspbian/pool/main/o/owfs/
Hey Andy,

thanks for the link!

Cheers

Martin



Re: Concurrency issues?

Jan Kandziora
In reply to this post by Guy COLIN
On 24.05.2016 at 18:02, Guy COLIN wrote:
>
> Ok I'll update to owfs-3.1p1.
>
Ah, for the 3.16rc and later kernels that's only necessary if you use
--w1 instead of accessing I²C directly. Sorry for the confusion. The
ds2482 kernel module is not loaded, so it seems all ok.

However, there are a number of other bugs fixed in 3.1p1 as well, so
updating may be a good decision anyway.


> I know the difference but I'm ok this way, I let owserver
> update the cache.
>
Oh, you misunderstood. Reading from /uncached means the cache is not
considered, but instead a fresh sample is taken.

The cache is *always* updated after taking a sample.


> And each time a device disappear I check in the uncached
> directory: it's also disappeared.
>
You cannot debug the bus this way. If you don't read *uncached*, owfs
doesn't do anything on the bus; it reads from the cache until the cache
expires. So you get only one new directory sample every 60 seconds.

No wonder you have so few errors.
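
A minimal sketch of a bus check along these lines, assuming the known device IDs are kept in a file, one per line. The path /etc/1wire-expected.txt and both function names are hypothetical and not part of owfs. It takes one fresh listing from "owdir /uncached" and prints the expected devices that are absent from it:

```shell
#!/bin/sh
# Sketch only: report which expected devices are absent from a listing.
# The listing is read from stdin; $1 is a hypothetical file holding one
# device ID per line (e.g. 26.FFD8F1000000).

# Reduce each line to a bare device ID, so that "/uncached/10.9702E6010800"
# and an "ls -ld" line ending in ".../10.9702E6010800" compare equal.
device_ids() {
  sed -e 's|/*$||' -e 's|.*/||' | grep -E '^[0-9A-F]{2}\.[0-9A-F]{12}$' | sort -u
}

# Print every ID from the expected file that the listing on stdin lacks.
missing_devices() {
  expected=$1
  listing=$(mktemp) || return 1
  device_ids > "$listing"
  sort -u "$expected" | comm -23 - "$listing"
  rm -f "$listing"
}

# Typical use (needs a running owserver, so it is left commented out):
#   owdir /uncached | missing_devices /etc/1wire-expected.txt
```

Because the listing is captured once and then compared, the devices reported as missing come from the same sample that triggered the report.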


Kind regards

        Jan


Re: Concurrency issues?

Guy COLIN
Hello,

Jan Kandziora <jjj <at> gmx.de> writes:
> Ah, for the 3.16rc and later kernels that's only necessary if you use
> --w1 instead of accessing I²C directly. Sorry for the confusion. The
> ds2482 kernel module is not loaded, so it seems all ok.
>
> However, there are a number of other bugs fixed in 3.1p1 as well, so
> updating may be a good decision anyway.
>

Yes, I'll update. But first I want to try to understand what's going on and
let my script run for a few days.

> Oh, you misunderstood. Reading from /uncached means the cache is not
> considered, but instead a fresh sample is taken.
>
> The cache is *always* updated after taking a sample.
OK, I wasn't clear in my explanations. I understand that reading
from uncached updates the cache. I mean that in my system I do everything
through /mnt/1wire. I don't need to be fast. The cache is updated every minute;
that's OK for me. My system basically does 2 things: first, it logs values (temp,
humidity, etc.) every 5 minutes and generates graphs from these values; second,
it drives the heating system and the swimming pool filter pump. When I write a 0
or a 1 to a PIO through /mnt/1wire, owfs does it instantaneously, as described in
the doc (it doesn't wait 60 s), so I don't have to write to uncached. So that's OK.
Anyway, in these applications 1 minute is nothing. I have a hysteresis of
0.1°C in my home heating system. For example, if the set point is 20°C, the
heating start order is given when the measured value is below 19.9 and the stop
order when it is above 20.1, so it heats for something like 15 minutes. So I'm OK
doing everything with the cached values. But OK, to debug we have to understand
whether there are any discrepancies between cached and uncached and, more than
that, between /mnt/1wire and owdir.

> You cannot debug the bus this way. If you don't read *uncached*, owfs
> doesn't do anything on the bus but read the cache instead until it
> expires. So you get a one new directory sample every 60 seconds.
I mean I have seen many times that when a device disappears from /mnt/1wire it
also disappears from /mnt/1wire/uncached. But it can be very strange: I have
links like TempSensorRoom pointing to the corresponding sensor. If the sensor
disappears from the directory, sometimes (not always) I can still read it through
TempSensorRoom/temperature.


> No wonder you have so few errors.
>
I used to have no errors at all. And now, yes, it's only a few, but it can
increase at any time. The thing is, I hate to have "holes" in my graphs. By the
way, I'm wondering if these problems appeared after I updated Raspbian (and
thus the kernel). I used to keep the system up to date (I stopped when
I went read-only in August last year). But 3 years ago I had no lost
measurements at all; it was 100% OK. So maybe the problems started after a
kernel update. I can't be sure; I can't remember.

Anyway, I'm monitoring my script's output; let's see what comes.

Best regards
--
Guy



Re: Concurrency issues?

Jan Kandziora
On 25.05.2016 at 19:01, Guy COLIN wrote:
>
> Ok, I wasn't clear in my explanations. I understand that the fact of reading
> from uncached updates the cache. I mean that in my system I do everything
> with /mnt/1wire. I don't need to be fast. The cache is updated every minute
> that's ok for me.
>
The cache for a certain directory is updated when you read that
directory (and it's not taken from the cache). There isn't an automatic
update.



> My system basically does 2 things: 1st log values (temp,
> humidity, etc.) every 5 minutes and generate graph from these values and 2nd
> drives the heating system and the swimming pool filter pump. When I write a 0
> or a 1 to a PIO thru /mnt/1wire owfs does it instantaneously as described in
> the doc (doesn't wait 60sec)
>
In that case, the directory cache is never updated because you never
sample the directory. That's okay for OWFS as it doesn't make
assumptions about the chips connected. Aside from the cache, it's a
really "raw" driver.

Writing doesn't affect the cache. The cache is only about reading
values. (There is an exception: there are properties marked as
reads-as-written, but I'd have to investigate how the cache kicks in there.)


> Anyway in these applications 1 minutes is nothing, I have an hysteresis of
> 0.1°C in my home heating system. Example the set point is 20°C the heating
> start order will be given when the measured value is <19.9 and stop when
> >20.1, so it will heat during something like 15 minutes. So I'm ok by doing
> everything with the cached values.
>
When you sample that slowly, the cache is always expired (it's 15s for
properties and 60s for directories), so /uncached won't change anything.


> I mean I have seen many times that when a device disappear from /mnt/1wire it
> also disappear from /mnt/1wire/uncached. But it can be very strange I have
> links like TempSensorRoom pointing to the corresponding sensor. If the sensor
> disappear from the directory sometimes, not always, I can still read it thru
> TempSensorRoom/temperature
>
If the cache isn't expired that's perfectly normal. OWFS doesn't connect
the directory listing to any other property.


And even if the cache is expired (or /uncached is read), you may access
chips which haven't been listed in the previous directory sampling. That
is because chips may come and go anytime they want and you shouldn't be
required to "Search ROM" for them if you know them already.

In short: OWFS doesn't make any assumptions about the chips present on the
bus. Each transaction starts with a clean slate.


Kind regards

        Jan


Re: Concurrency issues?

Guy COLIN
Jan,

thanks for your explanations. How the cache works is clearer to me now.
So if I know my devices' ROM numbers I don't need to care about the directory
and can access them directly. That's actually what I'm doing.
But still, sometimes a device doesn't answer, and this is what I'm trying to
troubleshoot.
Anyway, first, sorry for the late update, but I'm too busy elsewhere.
My script has been running for 61 hours and has caught a lot of errors.
I haven't finished investigating these errors (sorting, filtering,
understanding, getting something out of them). I'll post again later.
What I can say now is:
in 61 hours (approximately) the script did 43920 tests (one every 5 seconds)
the script triggers when there are not 7 devices in /mnt/1wire (more or fewer
than 7); it triggered 108 times
so 108/43920 is 0.25%
It means the directory /mnt/1wire was OK 99.75% of the time, not bad!
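
The figures above can be re-derived with a few lines of shell. This is only a worked check of the arithmetic; the function name trigger_stats is made up for the illustration:

```shell
#!/bin/sh
# Worked check: number of tests in a monitoring run and the share that failed.
# Arguments: hours of runtime, seconds between tests, number of triggers.
trigger_stats() {
  hours=$1
  interval=$2
  triggers=$3
  tests=$(( hours * 3600 / interval ))
  # Shell arithmetic is integer-only, so awk computes the percentages.
  awk -v n="$triggers" -v t="$tests" \
    'BEGIN { printf "tests=%d failed=%.2f%% ok=%.2f%%\n", t, 100 * n / t, 100 - 100 * n / t }'
}

trigger_stats 61 5 108
# prints: tests=43920 failed=0.25% ok=99.75%
```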
Within these 108 errors there are different types. The most common is
that my humidity sensor (family 26, DS2438) disappears from /mnt/1wire, but
there are others as well: sometimes my LCD (family 29, DS2408) appears 2 or 3
times in /mnt/1wire. Sometimes, more rarely, a device is missing from the
uncached directory or from the output of owdir or owdir /uncached.
Anyway, it's worth mentioning that these errors have not been a nuisance during
all this time, because my graphs are perfect, with no measurements missing. But
that doesn't mean I won't miss anything in the coming days. I'll keep watching.
I'll keep posting here.

Best regards and good weekend everybody
--
Guy



Re: Concurrency issues?

Guy COLIN
Hello,

this is an update. I have sorted and filtered all the data collected previously.
In 61 hours there were 108 triggers. So, as already said, my 1wire device
directories were OK 99.75% of the time and wrong 0.25%.
When they are wrong, the errors are the following:

108 errors in /mnt/1wire
50% missing humidity sensor 26.FFD8F1000000 (DS2438)
50% LCD driver appears twice or even 3 times 29.1C9E09000000 (DS2408)

At the same time there were 32 errors in /mnt/1wire/uncached or in the "owdir"
or "owdir /uncached" output.
The most common errors are:
missing humidity sensor 26.FFD8F1000000 (DS2438)
missing LCD driver 29.1C9E09000000 (DS2408)
LCD driver appears twice or even 3 times 29.1C9E09000000 (DS2408)

My monitoring script is still running.
However, it's worth saying that during all this time I had no missing
measurements. My graphs are OK.
Also, I think that monitoring the directories isn't the best method.
I'm going to use a new script to read the devices and check whether any are
missing.
It will take some time; no problem, I'm patient ;-)
I'll keep updating here.

Best regards
--
Guy



Re: Concurrency issues?

Jan Kandziora
On 30.05.2016 at 17:27, Guy COLIN wrote:

>
> this is an update. I have sorted, filtered all data got previously.
> in 61hours 108 triggerings. So as already said it means my 1wire device
> directories were ok 99.75% of the time and wrong 0.25%.
> When they are wrong the errors are the following
>
> 108 errors in /mnt/1wire
> 50% missing humidity sensor 26.FFD8F1000000 (DS2438)
> 50% LCD driver appears twice or even 3 times 29.1C9E09000000 (DS2408)
>
A double listing of devices means the "Search ROM" trickery is failing, most
likely because it is very fragile: any single bad bit breaks it, and there are
no checksums at all.
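
As a rough sketch, doubled entries can be picked out of a saved listing with standard tools. The function name duplicate_devices is hypothetical; the input can be "owdir" output or "ls -ld" lines, since only the last path component is kept:

```shell
#!/bin/sh
# Sketch only: print "count ID" for every device that occurs more than once
# in the listing read from stdin.
duplicate_devices() {
  sed -e 's|/*$||' -e 's|.*/||' |   # keep only the last path component
    sort |
    uniq -c |                        # count identical IDs
    awk '$1 > 1 { print $1, $2 }'    # keep IDs seen more than once
}

# Typical use against a snapshot file such as /tmp/FileTmp1 (commented out):
#   duplicate_devices < /tmp/FileTmp1
```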

You don't have a star topology cabling, do you?

Do you have homebrewed slaves on your bus?

If not, how do you power the sensor and the LCD driver board? With a
local power supply, or using additional wires of the bus cable? How long
is the bus cable, and what cross-section do the wires have?


> at same time 32 errors in /mnt/1wire/uncached or "owdir" output or
> "owdir uncached" output
>
Okay, so your problem isn't related to the owfs daemon. Please do
further tests with owdir and owread anyways.


> However it's worth to say that during all this time I had no missing measure.
>
Do you bypass the measurement when the directory listing says the device
isn't there? That would explain it.

Kind regards

        Jan




Re: Concurrency issues?

Stefano Miccoli
In reply to this post by Guy COLIN

On 30 May 2016, at 17:27, Guy COLIN <[hidden email]> wrote:

Also I think that monitoring the directories isn't the best method.
I'm going to use new script to read the devices and check if some are 
missing.
It will take sometime, no problem I'm patient ;-)
I'll keep updating here.

My 1cent of advice:

I have had a (Python) script monitoring a network of sensors (DS2438) for years.

Of course your mileage may vary, but here are my choices.

1) The list of sensors is hardwired in a configuration file, no device listing or discovery at runtime.
2) Data is logged with rrdtool in RRD databases (one for each sensor) with step=60s.
3) Every 30s all sensors are read and the corresponding RRDs updated. If a read fails, don't worry, just skip it.

By doing so you will end up with an RRD database that records every 60s the average of two sensor samples. In the case of a failed read, the logged value will be that of a single sample, possibly from a previous time step. (Actually the story is a little more complicated… every 60s a primary data point is generated by averaging the valid samples, typically two. What actually is recorded in the database depends on the consolidation function used, but this is another story).

Read errors are so infrequent that I simply stopped looking at the logs. RRD is not an easy tool, but when properly implemented and tuned it becomes very robust; at an eyeball inspection the graphs generated from the RRD databases are flawless: no missing points, no spikes or outliers.
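
The three choices above could be sketched in shell roughly as follows. This is only an illustration, not the Python script mentioned above: the sensor IDs are borrowed from elsewhere in this thread as placeholders, and `RRD_DIR`, `READ_CMD`, `UPDATE_CMD` and `poll_once` are names invented for the example. It assumes `owread` exits non-zero on a failed read and that each RRD already exists with a 60 s step.

```shell
#!/bin/sh
# Sketch only. The sensor IDs, the RRD directory and the variable names are
# placeholders, not taken from the script described above.
SENSORS="26.FFD8F1000000/humidity 28.4E3066020000/temperature"
RRD_DIR=${RRD_DIR:-/var/lib/rrd}
READ_CMD=${READ_CMD:-owread}       # reads one value from owserver
UPDATE_CMD=${UPDATE_CMD:-rrdtool}  # stores one sample

# Read every configured sensor once and feed the matching RRD.
poll_once() {
  for path in $SENSORS; do
    # A failed read is simply skipped; the next pass will try again.
    value=$($READ_CMD "/uncached/$path" 2>/dev/null) || continue
    value=$(printf '%s' "$value" | tr -d ' ')   # strip any padding
    $UPDATE_CMD update "$RRD_DIR/${path%%/*}.rrd" "N:$value"
  done
}
```

A loop such as `while :; do poll_once; sleep 30; done` would then give two samples per 60 s step. Keeping the sensor list fixed means a sensor that is briefly missing from a directory listing never changes what gets polled.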

Stefano


Re: Concurrency issues?

Guy COLIN
Hi Jan,
thanks for considering my (small) problem.

>Double listing of devices means the "Search ROM" trickery fails. Most
>likely because it's very fragile on any bit. No checksums at all.
No CRC at all? I thought there was a CRC in each ROM number?

>You don't have a star topology cabling, do you?
No, everything is daisy-chained.

>Do you have homebrewed slaves on your bus?
No, only commercial devices.

On this Raspi the bus master is a DS-2482-800. That's I2C->1wire, 8 channels.
Here is my bus listing:
/mnt/1wire/bus.0/bus.0/:
29.1C9E09000000 (DS-2408 LCD driver (by hobby-boards))
/mnt/1wire/bus.0/bus.1/:
3A.546302000000 (DS-2413 PIO start/stop house heating gas boiler)
3A.B36002000000 (DS-2413 PIO start/stop 12V power for LCD)
/mnt/1wire/bus.0/bus.2/:
10.9702E6010800 (18S20 temp sensor inside this Raspi box)
/mnt/1wire/bus.0/bus.3/:
10.B6FAE5010800 (18S20 temp sensor inside humidity sensor case)
26.FFD8F1000000 (DS-2438 + humidity sens. (by hobby-boards))
28.4E3066020000 (18B20 temp. sens. living room Ref. for heating)
/mnt/1wire/bus.0/bus.4/:
/mnt/1wire/bus.0/bus.5/:
/mnt/1wire/bus.0/bus.6/:
/mnt/1wire/bus.0/bus.7/:
You can see pictures of my system here:
http://gcolin.hd.free.fr/SystemDescriptionGujan/Pictures.html
Look at part 1 only: the gas boiler.

I bought from hobby-boards (now closed) the LCD driver and the humidity
sensor. Doc is here:
LCD:
http://gcolin.hd.free.fr/SystemDescriptionGujan/docTech/LCD%20Driver%20v3.0%20Schematic.pdf
Humidity sensor:
http://gcolin.hd.free.fr/SystemDescriptionGujan/docTech/Hobby%20board%20Humidity%20Temp%20Schematic.png
The longest cable is the one on bus 3: I have a CAT5E network cable 5m or 6m
long going to a junction box where it's connected (soldered) to a 3m
telephone cable of the same cross section as the CAT5 cable (also twisted pair).
This cable goes to the Hobby-board humidity sensor (26.FFD8F1000000 +
10.B6FAE5010800) connected by RJ45 (hobby-boards standard). Then a 0.4m
cable connected by RJ45 goes to the living room temp. sensor (28.4E3066020000).
So these 3 devices are daisy chained on bus 3.
In these cables 1 pair is used for 1wire DQ+GND and 1 wire from another pair
is +12V to power the humidity sensor (power consumption is 2.83mA).
Note that as per the hobby-board doc (link above) it's possible to use
parasitic or external power. I have tried both. But I got wrong humidity
values with parasitic power (humidity around 10% lower).

Side note: I used to have my external temp. sensor on this bus.3 as well,
but it was connected from the junction box by another 3m cable. So it was a
kind of star topology:
1x5m cable from bus master to junction box then 1x3m (Extern. temp) + 1x3m
(Humid+ Intern. temp). But 3m isn't very long and it has worked fine for
years (months?).
When I started to miss measurements I put this external temp sensor on its own
cable from the bus master and on its own bus (bus.7). But it didn't solve
the problem.
This external temp sensor was the one missing most often. Even with its own
cable and own bus! So a few weeks ago I moved it to an arduino connected by
serial to the raspi, and since then everything is ok.
But I don't want to keep this. This sensor must go back to the raspi. When
troubleshooting my system I'm planning to connect it back to the raspi to
see if I can catch something.
Note that when I'm talking of missing measurements it's usually not a lot: I can
go for weeks without a problem, then for a few hours it's missing.

The LCD is another story: it's on its own bus and is located near the Raspi
as you can see on the photos. The bus cable is something like 0.3m.
In 2012 I was struggling to get the 3 push-buttons attached to the LCD working
properly and I found help here on this mailing list and solved the problem.
Thanks again to Patryk
http://comments.gmane.org/gmane.comp.file-systems.owfs.devel/9429
The 12V goes in the same 0.3m cable as the 1wire DQ+GND (we can see it
on the photos).
When not in use there is no power to this LCD driver. To use it you push a
button connected to a Raspi GPIO; the Raspi then sends an order to a 1-wire
PIO (3A.B36002000000), closing a relay and sending 12V to the LCD.
The LCD backlight goes on and a script starts scrolling through
various measured values (ext. temp., int. temp, humid. etc.) for 60 sec and
stops.
If you press a button during these 60 sec you access a menu and can change
parameters such as the heating temp. set point, etc.
This works perfectly ok, and I'm happy with it. Even if the last
troubleshooting error report shows some problems sometimes, this isn't
a sensor and nothing is recorded from it. So it's ok.
Also about the LCD it's worth mentioning that in past years it was
sometimes randomly displaying rubbish characters. After months (years?) of
troubleshooting I found that my script and owserver were simultaneously
accessing it and causing the problem.
Adding this line to my owfs.conf solved the problem:
# Guy 8 January 2014 disable cache for stable values (PIO)
# for the random LCD bug: random garbage characters
timeout_stable = 0
I found in the owfs doc that by setting this to 0 owfs will never read the
2408 to update the cache.
This definitely solved the random rubbish characters on the LCD.

So mainly the problem I focus on is on bus.3 (humid. + temp. sens.)
As said previously my next step will be to check with owread. It's coming. I
need more time. I'll keep updating here.

Best regards
--
Guy



Re: Concurrency issues?

Guy COLIN
Hi Stefano,

Thanks for your comment.
Actually you are right: errors are so few that I can just forget them. This
is what I'm doing for my heating system control: if the reading isn't correct I
use the previous one.
I also use rrdtool to generate my graphs (you can see them on my web site
http://gcolin.hd.free.fr/ ), but honestly I never went deep enough to be a
power user of it. You are right, maybe I should start with that.
Depending on the measurement, when I miss values I get a flat trend in my
graph (the value isn't changing). But sometimes it lasts too long and is not
acceptable!
The thing is that it has worked for years without problems!
Maybe, like me, it is getting old ;-)
I log data every 5 min. I also don't need to discover devices at boot or at
any time, I have links pointing to the devices I know and need. But..
sometimes a device disappears.
Anyway I continue my search.
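
The fallback mentioned above (reuse the previous reading when a read is not correct) could look roughly like this in shell. It is only a sketch, not the actual heating-control script: `STATE_DIR`, `READ_CMD` and `read_or_previous` are names invented for the example, and it assumes `owread` exits non-zero when a read fails.

```shell
#!/bin/sh
# Sketch only: STATE_DIR, READ_CMD and the function name are made up for
# this example.
STATE_DIR=${STATE_DIR:-/tmp/last_good}
READ_CMD=${READ_CMD:-owread}

# Print the current reading of "$1" (e.g. 28.4E3066020000/temperature);
# if the read fails, print the last value that was read successfully.
read_or_previous() {
  mkdir -p "$STATE_DIR"
  last="$STATE_DIR/$(printf '%s' "$1" | tr '/' '_')"
  if value=$($READ_CMD "/uncached/$1" 2>/dev/null); then
    printf '%s\n' "$value" | tr -d ' ' > "$last"
  fi
  # Fails (prints nothing) only when there has never been a good reading.
  [ -f "$last" ] && cat "$last"
}
```

Storing a timestamp next to the value would make it possible to refuse a reading that is too old, which is the case described above where a flat trend lasts too long.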

best regards
--
Guy



Re: Concurrency issues?

Jan Kandziora
In reply to this post by Guy COLIN
On 31.05.2016 at 19:00, Guy COLIN wrote:
> Hi Jan,
> thanks for considering my (small) problem.
>
>> Double listing of devices means the "Search ROM" trickery fails. Most
>> likely because it's very fragile on any bit. No checksums at all.
> No CRC at all? I thought there was a CRC in each ROM number?
>
The "Search ROM" mechanism can't have a CRC because it is a wired-AND on
the bus line. Neither the host nor the slave know the result of the
ANDed bits in advance.
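
To make the wired-AND point concrete, here is a small shell illustration of a single bit position in a Search ROM pass. It is a toy model, not owfs code: the function names are invented, and it only reproduces the standard rule that each device sends its ROM bit followed by the complement while the host sees the AND of all answers.

```shell
#!/bin/sh
# Illustration only: one bit position of a 1-Wire "Search ROM" pass.

# Wired-AND of the bits given as arguments: the line reads 0 as soon as
# one device pulls it low, and 1 when nobody does (or nobody is there).
wired_and() {
  for bit in "$@"; do
    if [ "$bit" -eq 0 ]; then
      echo 0
      return
    fi
  done
  echo 1
}

# Arguments: the ROM bit of every device still taking part in the search.
# Each device sends its bit, then the complement; the host only sees the
# AND of each slot and has to decide from those two bits alone.
search_step() {
  complements=""
  for bit in "$@"; do
    complements="$complements $((1 - bit))"
  done
  first=$(wired_and "$@")
  second=$(wired_and $complements)   # unquoted on purpose: split into bits
  case "$first$second" in
    01) echo "all 0" ;;
    10) echo "all 1" ;;
    00) echo "conflict" ;;
    11) echo "no device" ;;
  esac
}

search_step 1 1   # two devices agree on this bit
search_step 0 1   # two devices differ: the host must explore both branches
search_step       # empty bus
```

Every one of the four two-bit patterns is a valid answer, so a glitch that flips one slot is indistinguishable from a real response. The host then follows a wrong branch and can end up with a wrong or incomplete device list.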


>> You don't have a star topology cabling, do you?
> No, everything is daisy-chained.
>
Good. So no problems from that.


>> Do you have homebrewed slaves on your bus?
> No, only commercial devices.
>
There are commercial devices with homebrewed chips out there.


> On this Raspi the bus master is a DS-2482-800. That's I2C->1wire, 8 channels.
> Here is my bus listing:
> /mnt/1wire/bus.0/bus.0/:
> 29.1C9E09000000 (DS-2408 LCD driver (by hobby-boards))
> /mnt/1wire/bus.0/bus.1/:
> 3A.546302000000 (DS-2413 PIO start/stop house heating gas boiler)
> 3A.B36002000000 (DS-2413 PIO start/stop 12V power for LCD)
> /mnt/1wire/bus.0/bus.2/:
> 10.9702E6010800 (18S20 temp sensor inside this Raspi box)
> /mnt/1wire/bus.0/bus.3/:
> 10.B6FAE5010800 (18S20 temp sensor inside humidity sensor case)
> 26.FFD8F1000000 (DS-2438 + humidity sens. (by hobby-boards))
> 28.4E3066020000 (18B20 temp. sens. living room Ref. for heating)
> /mnt/1wire/bus.0/bus.4/:
> /mnt/1wire/bus.0/bus.5/:
> /mnt/1wire/bus.0/bus.6/:
> /mnt/1wire/bus.0/bus.7/:
>
Looks okay.


> The longest cable is the one on bus 3: I have a CAT5E network cable 5m or 6m
> long going to a junction box where it's connected (soldered) to a 3m
> telephone cable of the same cross section as the CAT5 cable (also twisted pair).
>
That's okay, but keep in mind for later installations that twisted pair is
BAD for onewire. Twisted pair is only good if the drive is symmetrical.
For asymmetrical drive as with onewire, it only adds cable length and
capacitance.


> In these cables 1 pair is used for 1wire DQ+GND and 1 wire from another pair
> is +12V to power the humidity sensor (power consumption is 2.83mA).
> Note that as per the hobby-board doc (link above) it's possible to use
> parasitic or external power. I have tried both. But I got wrong humidity
> values with parasitic power (humidity around 10% lower).
>
The reason why I ask about that is ground lift. If you power e.g. a
display backlight or a solenoid through the bus cable (regardless of
whether you have separate power wires), you may get several 100mA of
current flowing back to the host through the ground wire, which may lead
to a ground lift of more than 0.5V over the length of the cable.

This can make the devices on the far end of the bus become invisible,
because when they pull "their" GND to 0.2V, on the host side it becomes
0.7V which makes it impossible for the host to "see" that pull to GND
from the slave.

If power is as low as 3mA, you don't have to worry about this.
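
As a rough check of the ground-lift argument, the drop along the ground wire is just U = I * R. The sketch below assumes about 0.084 ohm per metre (roughly 24 AWG copper, a common CAT5e gauge) and 9 m of cable (the 5m to 6m plus 3m run described above). Both figures are assumptions for illustration, not measurements from this installation, and `ground_lift_mv` is a name invented here.

```shell
#!/bin/sh
# Voltage drop (in mV) along the ground wire: U = I * R, with
# R = cable length * resistance per metre.
# Arguments: current in mA, cable length in m, wire resistance in ohm/m.
ground_lift_mv() {
  awk -v i="$1" -v len="$2" -v r="$3" 'BEGIN { printf "%.1f\n", i * len * r }'
}

# Assumed 0.084 ohm/m (roughly 24 AWG copper) and 9 m of cable.
ground_lift_mv 2.83 9 0.084   # humidity sensor current from the thread
ground_lift_mv 500 9 0.084    # hypothetical backlight or solenoid load
```

Under these assumptions the 2.83mA sensor lifts ground by only about 2mV, while a load of a few hundred mA would lift it by several hundred mV.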


>
> Side note: I used to have my external temp. sensor on this bus.3 as well,
> but it was connected from the junction box by another 3m cable. So it was a
> kind of star topology:
> 1x5m cable from bus master to junction box then 1x3m (Extern. temp) + 1x3m
> (Humid+ Intern. temp). But 3m isn't very long and it has worked fine for
> years (months?).
>
3m should work.


> When I started to miss measurements I put this external temp sensor on its own
> cable from the bus master and on its own bus (bus.7). But it didn't solve
> the problem.
> This external temp sensor was the one missing most often. Even with its own
> cable and own bus! So a few weeks ago I moved it to an arduino connected by
> serial to the raspi, and since then everything is ok.
> But I don't want to keep this. This sensor must go back to the raspi. When
> troubleshooting my system I'm planning to connect it back to the raspi to
> see if I can catch something.
> Note that when I'm talking of missing measurements it's usually not a lot: I can
> go for weeks without a problem, then for a few hours it's missing.
>
Check the cabling. There isn't much more to say about it. Check the cabling.



Kind regards

        Jan


Re: Concurrency issues?

Guy COLIN
Hello, this is another update.
Today I have stopped the script monitoring the devices in the directories.
As planned and as suggested I've started a new script monitoring the devices
with owread uncached.

Last result of the script monitoring the devices in the directories:
Ran for 146 hours, 291 errors detected
234 errors are missing devices

Device           missing  owdir  "owdir uncached"  /mnt/1wire  /mnt/1wire/uncached
28.4E3066020000       54      9                18          11                   16
26.FFD8F1000000      146     31                42          35                   38
10.B6FAE5010800       17      1                 7           0                    9
29.1C9E09000000       17      0                10           0                    7

49 errors are devices listed twice
 8 errors are devices listed 3 times
The x2 and x3 errors are all for device 29.1C9E09000000 (LCD driver), on the
same day at the same time (19:20 to 19:21 so 11 minutes)

Anyway these logs are not very meaningful.
Again no measurements missed with my readings every 5 minutes, no holes in my
graphs.

Now I have started this new script (below).
This shell script uses owread to read the devices in the uncached directory.
It performs a check every 5 seconds and it takes 4.3 to 4.8 seconds to read
the 7 devices.
The script checks whether owread terminates with an error and logs the error.
I have checked the script by physically disconnecting a device: it works,
errors are logged.
Any comments on the method or the script are welcome.
I will keep it running for a few days and after that connect again my temp
sensor which is currently connected to an arduino and was the most faulty
before.
Let's see what happens.
I've checked the cabling (again), nothing to report, connections are ok.

#!/bin/sh
# 5 June 2016 -- Script to check if any device disappears from the 1wire bus

LOG=/tmp/LogCheck1wire
TMP=/tmp/FileTmp

# Device/property pairs read on every pass
DEVICES="
10.B6FAE5010800/temperature
10.9702E6010800/temperature
3A.546302000000/sensed.BYTE
29.1C9E09000000/sensed.BYTE
28.4E3066020000/temperature
3A.B36002000000/sensed.BYTE
26.FFD8F1000000/humidity
"

while :
do
  sleep 5  # check every 5 seconds
  # The reads below take almost 5 sec (access to the 1wire bus for 7 devices)
  for entry in $DEVICES
  do
    if ! owread "/uncached/$entry" >"$TMP" 2>&1; then
      # Log the timestamp, the device ID and whatever owread printed
      printf '%s %s ' "$(date)" "${entry%%/*}" >> "$LOG"
      cat "$TMP" >> "$LOG"
    fi
  done
done

Best regards
--
Guy




Re: Concurrency issues?

Guy COLIN
Hi!
Update:
43 hours running with "owread uncached": ZERO errors!
The script sleeps 5 seconds, then reads 7 1wire devices, which takes approx
4.5 seconds more, then goes to sleep again.
So approx 16000*7 readings (43 hours at about 9.5 seconds per pass), no error.
When the previous script was reading the directories I got a few errors. So
I believe the difference comes from the search-ROM algorithm vs the read
scratchpad.
As mentioned by Jan it's better to use "owread uncached".
Well at least for now. I keep monitoring. I've modified the script to check
not only the owread errors but also whether the readings are within an
acceptable range. Here is the script.
Tomorrow I'll add a temp sensor...
Wait & see

Best regards
--
Guy

#!/bin/sh
# 5 June 2016 -- Script to check if any device disappears from the 1wire bus
# and if every reading stays within its acceptable range

LOG=/tmp/LogCheck1wire

# One entry per device: ID/property:min:max (both bounds are exclusive)
DEVICES="
10.B6FAE5010800/temperature:10:40
10.9702E6010800/temperature:10:40
3A.546302000000/sensed.BYTE:-1:4
29.1C9E09000000/sensed.BYTE:-1:256
28.4E3066020000/temperature:10:40
3A.B36002000000/sensed.BYTE:-1:4
26.FFD8F1000000/humidity:20:99
"

# Succeeds when min < value < max (arguments: value min max)
in_range() {
  [ "$(echo "$1 > $2" | bc -l)" = 1 ] && [ "$(echo "$1 < $3" | bc -l)" = 1 ]
}

COUNTER=0
while :
do
  COUNTER=$((COUNTER + 1))
  sleep 5  # check every 5 seconds
  # The reads below take almost 5 sec (access to the 1wire bus for 7 devices)
  for entry in $DEVICES
  do
    path=${entry%%:*}    # e.g. 10.B6FAE5010800/temperature
    limits=${entry#*:}   # e.g. 10:40
    id=${path%%/*}
    VALUE=$(owread "/uncached/$path" 2>&1)
    if [ $? -ne 0 ]; then
      # owread failed: log timestamp, device ID, pass counter and error text
      echo "$(date) $id   Counter =$COUNTER $VALUE" >> "$LOG"
      continue
    fi
    printf '%s' "$VALUE"  # show the reading on the terminal
    if ! in_range "$VALUE" "${limits%%:*}" "${limits#*:}"; then
      echo "$(date) $id NOT IN RANGE $VALUE  Counter =$COUNTER" >> "$LOG"
    fi
  done
  echo
done


