2014-03-31
§
|
21:02 |
<hashar> |
Making the Parsoid daemon write its logs to /data/project/parsoid/parsoid.log {{gerrit|122561}} |
[releng] |
20:17 |
<hashar> |
restarted parsoid daemon |
[releng] |
20:00 |
<hashar> |
stopped parsoid. It is killing the application servers |
[releng] |
19:53 |
<hashar> |
restarting both apaches |
[releng] |
19:21 |
<hashar> |
restarting job service on jobrunner01 to apply {{gerrit|122436}} |
[releng] |
19:20 |
<hashar> |
Unbreak puppetmaster on deployment-salt.eqiad.wmflabs |
[releng] |
19:01 |
<hashar> |
puppet master is broken :( |
[releng] |
17:39 |
<hashar> |
lowering # of jobs spawned by the jobrunner {{gerrit|122436}} |
[releng] |
16:00 |
<bd808> |
Restarted logstash service on deployment-logstash1; no new log events seen since 2014-03-28T10:57 |
[releng] |
15:58 |
<bd808> |
Updated kibana on deployment-logstash1 to e317bc6 |
[releng] |
15:56 |
<hashar_> |
Cluster slow because some CirrusSearch job is spamming simplewiki. Gotta find a way to throttle the number of jobs being run on jobrunner01 or add more apache boxes. It is transient anyway, might look at limiting the runs tonight |
[releng] |
15:10 |
<hashar_> |
Rebased puppet repository. Only one hack left: https://gerrit.wikimedia.org/r/#/c/119534/ |
[releng] |
14:20 |
<hashar> |
deleting deployment-parsoidcache01 cache the hard way: stopping varnish, deleting files in /srv/vdb/, starting varnish |
[releng] |
14:05 |
<hashar> |
shutting down database and apache boxes for now. |
[releng] |
14:03 |
<hashar> |
shutting down varnish instances in pmtpa |
[releng] |
13:56 |
<hashar> |
Deleted deployment-cache-upload01, replaced by deployment-cache-upload02 |
[releng] |
13:52 |
<hashar> |
upload varnish cache working :-] |
[releng] |
13:47 |
<hashar> |
applying role::cache::upload to role-cache-upload02 |
[releng] |
13:37 |
<hashar> |
migrating deployment-cache-upload02.eqiad.wmflabs to self puppet/salt master |
[releng] |
13:22 |
<hashar> |
Creating deployment-cache-upload02 to replace deployment-cache-upload01 which was missing the security group "web" |
[releng] |
11:30 |
<hashar> |
Update DNS entries to point to EQIAD instances (aka switching beta cluster to eqiad) |
[releng] |
2014-03-27
§
|
15:23 |
<hashar> |
role::beta::natfix can't run on deployment-bastion.eqiad because the ferm rules conflict with the Augeas rules coming from udp2log :-( |
[releng] |
15:21 |
<hashar> |
applying role::beta::natfix on deployment-bastion.eqiad |
[releng] |
14:58 |
<hashar> |
fixed up role::beta::natfix. Ferm is now being applied again on various application server instances {{gerrit|121378}} |
[releng] |
13:58 |
<hashar> |
rebased puppetmaster git repository, reapplied ottomata live hacks. |
[releng] |
12:55 |
<hashar> |
mediawiki l10n cache being rebuilt!!! |
[releng] |
12:54 |
<hashar> |
Fixed permissions on eqiad bastion for /srv/scap. Others (such as mwdeploy) could not read/execute scap scripts |
[releng] |
11:29 |
<hashar> |
MediaWiki code and configuration are now self updating on EQIAD cluster via Jenkins jobs. First run: https://integration.wikimedia.org/ci/job/beta-code-update-eqiad/4/console |
[releng] |
11:11 |
<hashar> |
deleting job beta-code-update, replaced by datacenter variants beta-code-update-pmtpa and beta-code-update-eqiad |
[releng] |
10:54 |
<hashar> |
Deleting job beta-update-databases, replaced by datacenter variants beta-update-databases-pmtpa and beta-update-databases-eqiad |
[releng] |