2019-02-16
16:26 <ariel@deploy1001> Finished deploy [dumps/dumps@8f83eea]: fix up multistream index file recombines for large files; better errors for misc dumps failures (duration: 00m 03s) [production]
16:25 <ariel@deploy1001> Started deploy [dumps/dumps@8f83eea]: fix up multistream index file recombines for large files; better errors for misc dumps failures [production]
14:21 <arturo> T194855 cloudvirt1020 is powered off, waiting for disk setup before installing [production]
13:59 <arturo> T193264 switched clouddb1001/1004 to the new project local puppetmaster [clouddb-services]
13:54 <arturo> T193264 create 'clouddb10' puppet prefix to store puppet/hiera config for database servers in this project [clouddb-services]
13:47 <arturo> T193264 create 'clouddb-services-puppetmaster' puppet prefix to store puppet/hiera config for this project puppetmaster [clouddb-services]
13:43 <arturo> T193264 create 'clouddb-services-puppetmaster-01' instance [clouddb-services]
13:33 <arturo> add myself as user and projectadmin [clouddb-services]
05:00 <zhuyifei1999_> fixed by restarting flannel. another puppet run simply started kubelet [tools]
04:58 <zhuyifei1999_> puppet logs: https://phabricator.wikimedia.org/P8097. Docker is failing with 'Failed to load environment files: No such file or directory' [tools]
04:52 <zhuyifei1999_> copied the resolv.conf from tools-k8s-master-01, removing secondary DNS to make sure puppet fixes that, and starting puppet [tools]
04:48 <zhuyifei1999_> that host's resolv.conf is badly broken https://phabricator.wikimedia.org/P8096. The last Puppet run was at Thu Feb 14 15:21:09 UTC 2019 (2247 minutes ago) [tools]
04:44 <zhuyifei1999_> puppet is also failing bad here 'Error: Could not request certificate: getaddrinfo: Name or service not known' [tools]
04:43 <zhuyifei1999_> this one has logs full of 'Can't contact LDAP server' [tools]
04:41 <zhuyifei1999_> nslcd also broken on tools-worker-1005 [tools]
04:34 <zhuyifei1999_> uncordon tools-worker-1014.tools.eqiad.wmflabs [tools]
04:33 <zhuyifei1999_> the issue was, /var/run/nslcd/socket was somehow a directory, AFAICT [tools]
04:31 <zhuyifei1999_> then started nslcd via systemctl and `id zhuyifei1999` returns correct stuffs [tools]
04:30 <zhuyifei1999_> `nslcd -nd` complains about 'nslcd: bind() to /var/run/nslcd/socket failed: Address already in use'. SIGTERMed a background nslcd, `rmdir /var/run/nslcd/socket`, and `nslcd -nd` seemingly starts to work [tools]
04:23 <zhuyifei1999_> drained tools-worker-1014.tools.eqiad.wmflabs [tools]
04:16 <zhuyifei1999_> logs: https://phabricator.wikimedia.org/P8095 [tools]
04:14 <zhuyifei1999_> restarting nslcd on tools-worker-1014 in an attempt to fix that, service failed to start, looking into logs [tools]
04:12 <zhuyifei1999_> restarting nscd on tools-worker-1014 in an attempt to fix seemingly-not-attached-to-LDAP [tools]
00:20 <XioNoX> add port 22 in cloud-in4 term labsdb [production]
2019-02-15
23:42 <bd808> Added BryanDavis (self), Arturo Borrero Gonzalez, Marostegui, and Jcrespo as admins in project [clouddb-services]
22:49 <bstorm_> created mariadb security group and lvs for a new database T193264 [clouddb-services]
22:49 <Joan> Restarted CVNBot3 (Last message was received on RCReader 5729.637672 seconds ago) [cvn]
20:40 <andrewbogott> enabled virtualization (all three settings) on cloudvirt1019 [production]
19:41 <arturo> T193264 reimaging cloudvirt1019 to get mitaka/stretch [production]
18:51 <arturo> T193264 icinga downtime cloudvirt1019 for 1 week [production]
18:44 <bstorm_> stopped replication and then mariadb on labsdb1004 [production]
18:18 <nuria> restarted turnilo in analytics-tool1002 [analytics]
17:28 <thcipriani> integration-slave-jessie-1002:/srv/jenkins-workspace/workspace$ `sudo rm -rf *` due to full disk [releng]
16:52 <cdanis> correction, needed to increment version; adding backported rasdaemon 0.6.0-1.2+deb8u2 to jessie-wikimedia [production]
16:48 <cdanis> adding backported rasdaemon 0.6.0-1.2+deb8u1 to jessie-wikimedia [production]
16:29 <bblack> reprepro: uploaded gdnsd-3.0.0-1~wmf1 to stretch-wikimedia [production]
16:28 <Lucas_WMDE> moved cronjob from trusty to stretch (following [[wikitech:News/Toolforge Trusty Move a cron job]]) [tools.wmde-access]
15:45 <moritzm> rebooting auth1001 for kernel security update [production]
14:50 <moritzm> installing unbound update from stretch point release [production]
14:45 <moritzm> removed labvirt1012 from debmonitor (got renamed to cloudvirt1012) (T216190) [production]
14:06 <moritzm> rebooting mwlog1001 for kernel security update [production]
13:54 <moritzm> rebooting mwlog2001 for kernel security update [production]
13:46 <jbond42> install tar security updates [production]
13:19 <moritzm> rolling reboot of mwdebug servers in eqiad to pick up SSBD-enabled qemu [production]
13:15 <Amir1> migrating the webservice to stretch+k8s [tools.mrmetadata]
13:12 <gtirloni> reboot cloudvirt1020 [production]
13:11 <arturo> T216239 labvirt1019 has been drained of any workload [production]
13:10 <arturo> T216239 labvirt1019 has been drained [admin]
13:06 <moritzm> installing NSS security updates [production]
12:42 <moritzm> installing squid3 security updates [production]