2019-04-07 §
16:54 <zhuyifei1999_> tools-sgeexec-0928 unresponsive since around 22 UTC. No data on Graphite. Can't ssh in even as root. Hard rebooting via Horizon [tools]
01:06 <bstorm_> cleared E state from 6 queues [tools]
2019-04-05 §
15:44 <bstorm_> cleared E state from two exec queues [tools]
2019-04-04 §
21:21 <bd808> Uncordoned tools-worker-1013.tools.eqiad.wmflabs after reboot and forced puppet run [tools]
20:53 <bd808> Rebooting tools-worker-1013 [tools]
20:50 <bd808> Draining tools-worker-1013.tools.eqiad.wmflabs [tools]
20:29 <bd808> Released floating IP and deleted instance tools-checker-01 via Horizon [tools]
20:28 <bd808> Shutdown tools-checker-01 via Horizon [tools]
20:17 <bd808> Repooled tools-webgrid-lighttpd-0906 after reboot, apt-get dist-upgrade, and forced puppet run [tools]
20:13 <bd808> Hard reboot of tools-sgewebgrid-lighttpd-0906 via Horizon [tools]
20:09 <bd808> Repooled tools-webgrid-lighttpd-0912 after reboot, apt-get dist-upgrade, and forced puppet run [tools]
20:05 <bd808> Depooled and rebooted tools-sgewebgrid-lighttpd-0912 [tools]
20:05 <bstorm_> rebooted tools-webgrid-lighttpd-0912 [tools]
20:03 <bstorm_> depooled tools-webgrid-lighttpd-0912 [tools]
19:59 <bstorm_> depooling and rebooting tools-webgrid-lighttpd-0906 [tools]
19:43 <bd808> Repooled tools-sgewebgrid-lighttpd-0926 after reboot, apt-get dist-upgrade, and forced puppet run [tools]
19:36 <bd808> Hard reboot of tools-sgewebgrid-lighttpd-0926 via Horizon [tools]
19:30 <bd808> Rebooting tools-sgewebgrid-lighttpd-0926 [tools]
19:28 <bd808> Depooled tools-sgewebgrid-lighttpd-0926 [tools]
19:13 <bstorm_> cleared E state from 7 queues [tools]
17:32 <andrewbogott> moving tools-static-12 to cloudvirt1023 to keep the two static nodes off the same host [tools]
2019-04-03 §
11:22 <arturo> puppet breakage due to me introducing openstack-mitaka-jessie repo by mistake. Cleaning up already [tools]
2019-04-02 §
12:11 <arturo> icinga downtime toolschecker for 1 month T219243 [tools]
03:54 <bd808> Added etcd service group to tools-k8s-etcd-* (T219243) [tools]
2019-04-01 §
19:44 <bd808> Deleted tools-checker-02 via Horizon (T219243) [tools]
19:43 <bd808> Shutdown tools-checker-02 via Horizon (T219243) [tools]
16:53 <bstorm_> cleared E state on 6 grid queues [tools]
14:54 <andrewbogott> moving tools-static-12 to eqiad1-r (for real this time maybe) [tools]
2019-03-29 §
21:13 <bstorm_> depooled tools-sgewebgrid-generic-0903 because of some stuck jobs and odd load characteristics [tools]
21:08 <bd808> Updated cherry-pick of https://gerrit.wikimedia.org/r/#/c/operations/puppet/+/500095/ on tools-puppetmaster-01 (T219243) [tools]
20:48 <bd808> Using root console to fix broken initial puppet run on tools-checker-03. [tools]
20:32 <bd808> Creating tools-checker-03 with role::wmcs::toolforge::checker (T219243) [tools]
20:24 <bd808> Cherry-picked https://gerrit.wikimedia.org/r/#/c/operations/puppet/+/500095/ to tools-puppetmaster-01 for testing (T219243) [tools]
20:22 <bd808> Disabled puppet on tools-checker-0{1,2} to make testing new role::wmcs::toolforge::checker easier (T219243) [tools]
17:25 <bd808> Cleared the "Eqw" state of 44 jobs with `qstat -u '*' | grep Eqw | awk '{print $1;}' | xargs -L1 sudo qmod -cj` on tools-sgegrid-master [tools]
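Note on the 17:25 entry: the pipeline there matches "Eqw" anywhere on a `qstat` line, so a job whose name happens to contain that string would also be selected. Below is a minimal sketch of a stricter filter. It assumes the default `qstat` column layout, in which the job state is the fifth whitespace-separated field and job names contain no spaces. The helper name `eqw_job_ids` is hypothetical and is not part of the logged command.

```shell
# Hypothetical helper: print the job IDs of jobs whose state is exactly "Eqw".
# Reads qstat-style output on stdin, so it can be tried on saved output first.
# Assumption: the state is the 5th whitespace-separated column, as in the
# default `qstat -u '*'` listing, and job names contain no spaces.
eqw_job_ids() {
    awk '$5 == "Eqw" { print $1 }'
}

# Possible usage on the grid master (not executed here):
#   qstat -u '*' | eqw_job_ids | xargs -L1 sudo qmod -cj
```

Running `qstat -u '*' | eqw_job_ids` on its own first shows which jobs would be cleared before anything is passed to `qmod -cj`.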
17:16 <andrewbogott> aborted move of tools-static-12; will wait until tomorrow and give DNS caches more time to update [tools]
17:11 <bd808> Restarted nginx on tools-static-13 [tools]
16:53 <andrewbogott> moving tools-static-12 to eqiad1-r [tools]
16:49 <bstorm_> cleared E state from 21 queues [tools]
14:34 <andrewbogott> moving tools-static.wmflabs.org to point to tools-static-13 in eqiad1-r [tools]
13:54 <andrewbogott> moving tools-static-13 to eqiad1-r [tools]
2019-03-28 §
01:00 <bstorm_> cleared error states from two queues [tools]
00:23 <bstorm_> T216060 created tools-sgewebgrid-generic-0901...again! [tools]
2019-03-27 §
23:35 <bstorm_> rebooted tools-paws-master-01 for NFS issue T219460 [tools]
14:45 <bstorm_> cleared several "E" state queues [tools]
12:26 <gtirloni> truncated exim4/paniclog on tools-sgewebgrid-lighttpd-0921 [tools]
12:25 <gtirloni> truncated exim4/paniclog on tools-sgecron-01 [tools]
12:15 <arturo> T218126 `aborrero@tools-sgegrid-master:~$ sudo qmod -d 'test@tools-sssd-sgeexec-test-2'` (and 1) [tools]
2019-03-26 §
22:00 <gtirloni> downtimed toolschecker [tools]
17:31 <arturo> T218126 create VM instances tools-sssd-sgeexec-test-[12] [tools]